The Journal of 
Experimental Education 


A periodical report of scientific investigations relating to child development, 
curriculum, learning, teaching, supervision, measurements, 
statistics, and experimental techniques. 








September 1961 








WISCONSIN STUDIES OF THE MEASUREMENT AND 
PREDICTION OF TEACHER EFFECTIVENESS 


A Summary of Investigations 








$7.50 A YEAR PUBLISHED QUARTERLY $1.90 A COPY 








Published by Dembar Publications, Inc., 
Madison 3, Wisconsin. 
Second class postage paid at Madison, Wisconsin 





EDITORIAL BOARD 


A. S. Barr, Chairman, Professor of Education, University of Wisconsin, Madison 6, Wis. 


Jacob O. [surname and title illegible], Southern Illinois University,
Carbondale, Illinois. Editorially responsible for materials on
university field research.


R. O. Collier, Associate Professor, College of Education,
University of Minnesota, Minneapolis. Editorially responsible for
materials on statistical methods of experimental research [partly
illegible], published each [month illegible].


[Given name illegible] Jersild, Professor of Education, Teachers
College, Columbia University, New York City. Editorially responsible
for materials on [illegible] development, published each [month
illegible].


H. H. Remmers, Professor of Educational Psychology,
Director, Division of Educational Reference, Purdue University,
Lafayette, Indiana. Editorially responsible for materials
on learning, teaching and supervision, published
each September.


J. Wayne Wrightstone, Director, Bureau of Educational
Research, Board of Education of the City of New York,
110 Livingston Street, Brooklyn, New York. Editorially
responsible for materials on curriculum construction,
published each June.


CONTRIBUTING EDITORS 


Emmett A. Betts, Director, Betts Reading Clinic, Haverford,
Pennsylvania.


Leo J. Brueckner, Professor Emeritus, University of Minnesota,
Minneapolis, Minnesota.


Oscar K. Buros, Professor of Education, Rutgers, The
State University, New Brunswick, New Jersey.


G. T. Buswell, Executive Secretary, American Educational
Research Association, 1201 Sixteenth St., N.W.,
Washington 6, D.C.


Harold D. Carter, Professor of Education, University of
California, Berkeley 4, California.


Leslie L. Chisholm, Associate Professor of Education,
State College of Washington, Pullman, Washington.


Herbert S. Conrad, Technical Consultant, College Entrance
Examination Board, Princeton, New Jersey.


Stephen M. Corey, Professor of Education, Teachers College,
Columbia University, New York, New York.


Robert A. Davis, Professor of Educational Research,
George Peabody College for Teachers, Nashville, Tennessee.


Harl R. Douglass, Director Emeritus of College of Education,
University of Colorado, Boulder, Colorado.


Harold A. Edgerton, Director, Occupational Opportunities
Service, Professor of Psychology, Ohio State University,
Columbus 10, Ohio.


John C. Flanagan, Professor of Psychology, University of
Pittsburgh, Pittsburgh, Pennsylvania.


Carter V. Good, Dean, College of Education and Home
Economics, Teachers' College, University of Cincinnati,
Cincinnati 21, Ohio.


Robert W. B. Jackson, Professor of Educational Research,
[illegible], University of Toronto, Toronto, Canada.


D. Welty Lefever, Professor of Education, University of
Southern California, Los Angeles, California.


Edward A. Lincoln, Consulting Psychologist, Halifax,
Massachusetts.


Irving Lorge, Professor of Education, Executive Officer,
Institute of Psychological Research, Teachers College,
Columbia University, New York 27, New York.


A. R. Mead, Director Emeritus of Educational Research,
University of Florida, Gainesville, Florida.


T. E. Newland, Professor of Education, University of
Illinois, Urbana, Illinois.


C. W. Odell, Professor Emeritus of Education, University 
of Illinois, Urbana, Illinois. 


Willard C. Olson, Dean, School of Education, University
of Michigan, Ann Arbor, Michigan.


Valworth R. Plumb, Chairman, Division of Education and
Psychology, University of Minnesota (Branch), Duluth,
Minnesota.


S. L. Pressey, Professor Emeritus of Educational Psychology,
Ohio State University, Columbus, Ohio.


William Reitz, Professor of Education, College of Education
Examiner, Wayne University, Detroit 2, Michigan.


H. D. Rinsland, Professor Emeritus of Education,
University of Oklahoma, Norman, Oklahoma.


Robert T. Rock, Jr., Professor of Psychology, Head of
Dept. of Psychology, Graduate School, Fordham University,
New York City.


Phillip J. Rulon, Professor Emeritus of Education,
Harvard Graduate School of Education, Cambridge,
Massachusetts.


John Schmid, Research and Evaluation Psychologist, Lack- 
land Air Force Base, San Antonio, Texas. 


Louis Schmidt, Associate Professor of Education,
School of Education, Indiana University, Bloomington,
Indiana.


David Segel, Director, Lincoln Guidance Research Project,
Public Schools, Albuquerque, New Mexico.


Helen Thompson, Assistant Clinical Professor in Clinical
Psychology, New York University, Bellevue Medical
Center, 120 East 75th Street, New York 21.


Robert L. Thorndike, Professor of Education, Chairman of
Department of Psychological Foundations and Services,
Teachers College, Columbia University, New York City.


Herbert A. Toops, Professor of Psychology, Ohio State 
University, Columbus, Ohio. 


Maurice E. Troyer, Vice President, Japan International 
Christian University, Tokyo, Japan. 


Helen M. Walker, Professor of Education, Teachers Col- 
lege, Columbia University, New York City. 


Guy M. Wilson, Professor Emeritus of Education, Boston
University, 33 Pine Street, Wellesley Hills, Massachusetts.


Paul A. Witty, Professor of Education, Director of Psycho- 
Educational Clinic, School of Education, Northwestern 
University, Evanston, Illinois. 


Depot R. Wood, Dean, Rocky Mountain College, Billings, 
Montana. 


D. A. Worcester, Visiting Professor of Education, University
of Wisconsin, Madison, Wisconsin.


Copyright 1961 Dembar Publications, Inc.





THE JOURNAL OF 
EXPERIMENTAL EDUCATION 







VOLUME XXX SEPTEMBER 1961 NUMBER 1 










WISCONSIN STUDIES OF THE MEASUREMENT AND 
PREDICTION OF TEACHER EFFECTIVENESS 


A Summary of Investigations 


by 


A. S. Barr 


Project Director, Professor of Education, University of Wisconsin 


D. A. Worcester 
Associate Director, Professor Emeritus of Educational Psychology and Measurement 
University of Nebraska, and Visiting Professor of Education, 
University of Wisconsin 


Allan Abell


Teaching and Research Assistant, Department of Education, University of Wisconsin 


Clarence Beecher 
Graduate Assistant, University of Wisconsin, Pan American Union Fellow 


Leland E. Jensen 
Assistant to the Director of Teacher Placement, University of Wisconsin 


Archie L. Peronto 
Supervisor, Student Teaching, University of Wisconsin 


Thomas A. Ringness 
Associate Professor of Education, University of Wisconsin 


John Schmid, Jr. 


Professor of Education, University of Arkansas 


DEMBAR PUBLICATIONS, INC. 


Madison 3, Wisconsin 








FOREWORD 


The preliminary work for the series of studies here summarized was done 
while the senior author was assistant director in charge of supervision, Detroit 
Public Schools, 1920-24. Two concepts arose from this experience: First, 
competent observers observing the same teacher simultaneously may disagree 
upon the quality of the teaching observed; and second, the subjective impres- 
sionistic vocabulary with which teaching is evaluated might be objectified by 
defining teaching in terms of observable teacher and pupil behaviors and con- 
ditions. Between 1924-30 a series of investigations was conducted to test these 
hypotheses. Three conclusions were drawn from these early studies: 


1. The fact that two or more observers observing the same teacher
simultaneously may disagree in the quality of teaching observed
was reaffirmed;

2. Good teachers cannot be separated from poor teachers in terms
of specific teacher behaviors (there is an appropriateness aspect
to teacher behaviors that must be taken into consideration); and

3. The evaluation of teaching can be objectified through the use of
teacher and pupil behaviors and operational definitions of the
personal and professional prerequisites to teacher effectiveness.


The subsequent years, 1940-60, have been employed in exploring ways and
means of validating an objective approach to teacher evaluation. Many
different approaches have been explored and many different data-gathering
devices employed: measures of pupil growth and achievement; tests of
qualities thought to be associated with teacher effectiveness and all
sorts of rating scales; studies of teacher and pupil behaviors and
interrelationships; and tests of basic knowledges, attitudes, and skills.

Many persons have assisted with these studies. The seventy-five or more
doctoral studies here summarized represent many hours of careful work on
the part of the persons making these studies. Participating in these
studies were many teachers, supervisors, and superintendents who gave
freely of their time and effort, and many members of the staff of the
Department of Education have served on advisory committees. The Graduate
School and the School of Education of the University of Wisconsin have
given almost continuous financial support to these studies. The summary
here presented has been made possible through financial assistance
provided by the Graduate School of the University of Wisconsin and the
personal assistance provided by colleagues and others as indicated by
the personnel herewith listed. To these and all others who have assisted
with these investigations I give my sincere thanks.


A. S. Barr 
April 17, 1961 








TABLE OF CONTENTS 


Nature of the Problem: A. S. Barr

The Criterion of Teacher Effectiveness: A. S. Barr

Methodology of the Investigations Here Summarized: A. S. Barr

The Data-Gathering Devices Employed: Clarence Beecher

The Use and Abuse of Correlational and Regression Techniques
in the Evaluation and Prediction of Teacher Effectiveness:
Allan Abell

Factor Analysis of the Teaching Complex: John Schmid

A Non-Additive Approach to the Measure of Teacher Effectiveness:
Leland E. Jensen

Patterns of Effectiveness and Ineffectiveness in Teachers:
Archie L. Peronto

Personal Prerequisites to Teacher Effectiveness: A. S. Barr

Motivation a Factor in Teacher Effectiveness: Thomas A. Ringness

Some Assumptions, Explicitly and Implicitly Made, in the
Investigations Here Summarized: D. A. Worcester

Teaching Ability and Its Correlates: A. S. Barr

Studies Summarized








JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961) 


CHAPTER I 


THE NATURE OF THE PROBLEM* 


A. S. BARR 


To select, recruit, educate, and assign teachers to particular teaching
positions in an acceptable manner, one must have more precise
information about the many meanings associated with teaching, in general
and in particular situations, and about how to identify the personal,
academic, and professional prerequisites to effectiveness.

Part of the difficulty associated with the development of an adequate
program for the measurement and prediction of teacher effectiveness
arises from the facts that teaching means many different things and that
the teaching act varies from person to person and from situation to
situation. Teachers teach different subjects and at different grade
levels; they may not teach subjects but direct activities; besides
classroom instruction they are presumed to be friends and counselors of
students, members of a school community, and members of various local,
state, and national associations of professional workers. With the
situation as it is, the researchers interested in the measurement and
prediction of teacher effectiveness have basically two choices: first,
to seek the essence of teaching found within a wide range of activities
called teaching and the means of predicting efficiency in a variety of
situations; or second, to measure efficiency in particular learning and
teaching situations and predict these particular efficiencies. These
particular situations may be carefully controlled situations as found in
experimental research or the uncontrolled situations of particular
schools, classes, and school systems. The first problem is one of
definition; one must define teaching before it can be evaluated and
effectiveness predicted.

A further difficulty arises out of the fact that the concept of
efficiency is nowhere well defined. Here, as with the definition of
teaching, there are many different concepts of efficiency. The opinions
are so varied among teacher educators, administrators, and teachers that
each person can be said to have a more or less private system of
evaluation all of his own. This is not a mere statement of opinion but a
matter that has been amply substantiated by research. The amount of
divergence that one would expect to find in a particular situation would
depend, of course, upon the composition of the group making the
evaluation, the extent to which an attempt has been made to standardize
criteria and provide training in their use, and the particular teacher
or aspect of teaching being evaluated. In uncontrolled situations the
judgments of a group of supervisors, administrators, and teacher
educators, all observing the same teacher at the same time, under
identical conditions, may vary so much that some observers may rate a
particular teacher as among the very best that they have observed and
others as among the very worst teachers that they have observed. Much
that is important in providing good schools depends upon the accuracy
with which teachers are evaluated.
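
As a rough illustration of the kind of divergence described above, the following sketch computes, for each teacher, the spread between the highest and lowest rating received, and the pairwise correlation between observers. It is only a sketch under stated assumptions: the seven-point scale, the observer labels, and every number are invented for illustration and are not data from any of the studies summarized here.

```python
from itertools import combinations
from statistics import mean

# Hypothetical ratings (1 = very poor, 7 = very good), invented for illustration.
# Each observer rated the same five teachers, listed in the same order.
ratings = {
    "supervisor":       [7, 4, 5, 2, 6],
    "principal":        [2, 5, 4, 6, 3],
    "teacher_educator": [6, 3, 5, 3, 7],
}


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def rating_ranges(table):
    """Highest minus lowest rating given to each teacher across observers."""
    per_teacher = zip(*table.values())
    return [max(r) - min(r) for r in per_teacher]


def pairwise_agreement(table):
    """Correlation between every pair of observers."""
    return {
        (a, b): pearson_r(table[a], table[b])
        for a, b in combinations(table, 2)
    }


ranges = rating_ranges(ratings)
agreement = pairwise_agreement(ratings)

for pair, r in agreement.items():
    print(pair, round(r, 2))
print("range of ratings per teacher:", ranges)
```

In this invented example two of the observers agree fairly closely while the third ranks the teachers in nearly the opposite order, which is the situation the paragraph describes: the same teacher is placed among the best by one observer and among the worst by another.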

There are many predictions made in the course of selecting and
recruiting persons to be trained as teachers, in the education of
teachers, and in the placement and employment of teachers. Many persons
have failed to realize that so many predictions are made in the
procurement and management of the teaching personnel. When admission
standards are set in teacher educating institutions, in effect a
prediction is made, namely, that persons meeting these standards do have
the potential to be good teachers; when curricula are established for
the education of teachers, a prediction is made, namely, that persons
who pursue these curricula will become good teachers; when the
superintendent employs an experienced or inexperienced teacher, he
assumes, or at least hopes, that she will succeed. The employment score
for some administrators is not very high in this respect.

The problem under discussion involves many psychological considerations.
First of all, attempts to measure and predict teacher efficiency make
assumptions relative to the nature of human abilities. As one examines
the literature, it can readily be observed that human abilities have
taken on many meanings. Sometimes they are thought of as physical
processes and sometimes as symbolical processes. Sometimes they are
referred to as psycho-physical entities and sometimes as operations.
They are sometimes considered as something very specific, such as the
ability to read the temperature from a very sensitive thermometer, or as
something very general, such as the ability to do certain types of
mathematical calculations. Sometimes they are spoken of in even more
general terms, as when one speaks of academic intelligence, mechanical
intelligence, or social intelligence. Sometimes abilities are referred
to as primary and sometimes as secondary; sometimes as deep and
underlying operations and sometimes as easily observed surface or
superficial events. We shall look upon all such uses of the term as
matters of convenience but shall at all times attempt to keep the reader
informed about the meaning intended.

*This summary has been made possible through a grant from the Graduate School, University of Wisconsin.

The nature of human abilities cannot be adequately defined without some
reference to the mind-body relationship. We shall assume that mind and
body are one and discuss teaching effectiveness, chiefly, in terms 1) of
operations, processes, and behaviors, 2) of conditions, internal and
external, considered essential to easy, smooth, and efficient
operations, and 3) of end products, outcomes and results, that follow
from operations. When words like intelligence, tact, forcefulness,
forthrightness or cooperation are used, they will, in general, be used
to describe behavior rather than innate qualities or traits. The
conditions, both external and internal, under which teaching takes place
will be treated as far as possible on a factual basis and in a manner
such as to include both psycho-physical processes and socio-physical
environment factors and their many interrelationships.

The above very general statement may be rendered somewhat more
meaningful by a further discussion of certain aspects of the problem.
Consider, for example, the personal internal condition essential to
efficient operations so generally lost sight of in teacher efficiency
studies. Proper physiological functioning, including sensory processes,
neurological processes, circulatory processes, digestive processes,
excretory processes, and glandular processes, is assumed to constitute a
basic source of conditions that limit and facilitate efficiency in
teaching. The psychologists are not ordinarily directly concerned with
these but they are nonetheless important and need much more
consideration than they ordinarily receive. They constitute,
doubtlessly, the most important single source of causes of efficiency
and inefficiency in teaching. These processes underlie many of the
causes of effective and ineffective teaching frequently discussed at a
more superficial level by psychologists and in a manner that may be
quite abstract and ambiguous.

In this connection, a point should be made of the fact that many persons
interested in effective classroom instruction will not be content to
merely study behavior. Much can be learned from the study of behavior
and as a psychological operation it certainly is a vast improvement over
the study of the stream of consciousness which characterized psychology
at the turn of the century. Many will desire, however, to know how
behaviors come to be, their concomitants, and antecedents. Much of the
current theorizing about behavior and other psychological uncertainties
arises out of the unwillingness of many psychologists to look beyond and
back of events. The point here is that much can be learned about teacher
efficiency from the study of psycho-physical processes.

Another critical point in the study of human abilities will be found in
the sensory processes and perception. Much attention was given to the
testing of the sensory processes at the turn of the century in attempts
to measure intelligence but not with too much success. Those studying
the matter did find out, however, that individuals differed, and laid
the basis for further investigation. As the matter has been pursued
further, and with better investigational procedures, much new
information has been obtained. Among the broader generalizations that
have arisen from this study is the concept of early learning. In a
certain sense, sensory perceptions are learned and much that will
condition human behavior and efficiency will be learned in the first few
years after birth and during the pre-school period. The nature and
functioning of perception needs careful study as it relates to teacher
efficiency. At a more complex and more highly generalized level,
fruitful research has already been conducted with reference to such
matters as social perception, self perception, and the teacher's
perception of all sorts of teaching situations. This would appear to be
closely related to any basic attack upon the measurement and prediction
of teacher efficiency.

Another critical point in human behavior and efficiency will be found in
the ties that have been established between physical and symbolic
manipulation in each individual teacher's way of thinking and behavior.
The building of these connections comes chiefly in preschool and early
elementary school education. This being as it is, the matter has been
generally looked upon as water over the dam and not amenable to further
study and improvement. But, presumably, the connection between the
physical world and the symbolic world is faulty or inadequate in many
instances. Many people live in a world of symbols that have little
connection with the realities of life. As a matter of fact, much of the
discussion of teaching is quite remote. Possibly a study of just how
teachers perceive the many things associated with teaching, the concepts
that they hold, and their ability to manipulate verbal symbols might
provide new insights into teacher efficiency. Teaching involves an
immense amount of verbalization.

In the immediately preceding discussion, attention has been focused upon
the antecedents of teacher efficiency. The researcher who purports to
assess human abilities must be equipped not merely with lists of the
concomitants of ability; he must also have ideas about how these are
categorized and structured. One of the theories about structure is that
ability can be thought of as some general power (Spearman called this
general factor g) plus a host of special capabilities. Other theories
emphasize the importance of group factors. Thurstone, for example,
stresses the importance of what he called the primary mental abilities.
The manner in which he structured these abilities, useful as it seemed
to be for certain purposes, may or may not be helpful for the evaluation
and the prediction of teacher effectiveness. Much has also been said
about the dimensions of abilities. Thorndike, for example, emphasized
the dimensions of breadth and depth. Guilford has pursued this matter
intensely for many years and has suggested a much more elaborate
structure. It would seem that the researcher in this area must give some
consideration to structure.

More use should be made of psychological theory. It is an important
source of ideas about teaching ability and competency, but the other
side of the coin is equally important: what is done must be workable,
practical, or useful in a situation composed of many factors other than
psychological theory. Besides the skepticism that many practitioners
have of psychologists and their theories, there is the need that theory
be set forth with more objectivity, and in a manner that is more
meaningful to practitioners. For one thing, theory must be simplified;
this usually comes with more insight. What is frequently lost sight of
is that theory by its very nature tends to emphasize the generalizable
and to neglect the specifics. Practitioners are seldom allowed to forget
specifics. The researchers involved in the investigations here
summarized would appear to be generally familiar with the theories of
the structure of human abilities but their specific assumptions in this
respect are frequently inadequately set forth.

Even when one is clear about the various ways of categorizing and
structuring human abilities, different vocabularies may be employed in
talking about the matter. This would appear to be particularly true of
the discussions of the special talents presumed to be prerequisites to
different teaching assignments, wherein teachers teach many different
things under a variety of conditions to pupils with different
capacities. It is not generally assumed that the special talents
required of teachers are possessed in equal amounts by any particular
teacher or that they are highly intercorrelated. This latter point
becomes exceedingly important in correlation studies where traditionally
low correlations are generally assumed to be undesirable. At various
places in the discussion to follow, a point will be made of the fact
that correlations cannot be taken at face value. At this point it is
emphasized that the talents of individuals are highly diversified and
that it is doubtlessly psychologically unsound to expect many high level
talents in any particular individual or that they will be highly
intercorrelated. It is probably sound to assume that there is a
considerable number of one-talent teachers, a lesser number of
two-talent teachers, and very few many-talented teachers. In light of
current theories of the structure of human abilities, it is probably
best to hypothesize that teaching ability is composed of some particular
combinations of special and general abilities. Much of the research
relative to the evaluation and prediction of teacher effectiveness does
not seem to reflect this point of view to any considerable extent.

Besides the foregoing problems there are the discrepancies between
potential and observed efficiency that one must expect. For many reasons
many individuals do not live up to expectancy. These discrepancies may
arise from many sources: lack of interest in pupils, teaching, and the
subjects taught; lack of physical energy, determination and drive; lack
of adaptability, flexibility, or the ability to adjust to different
needs, persons, and situations; personality conflicts, rigid value
systems, and attitudes unacceptable to majority groups: teachers,
parents, pupils, administrators, supervisors and all others with whom
the teacher may come in contact. These deal chiefly with the feeling
components of behavior and are part of the potential, except that
measures of potential have dealt chiefly with its cognitive aspects.
Since, for various reasons, people do not live up to expectations, those
who would predict efficiency must include in their machinery of
prediction some of the non-static aspects of behavior.

Possibly some thought should also be given to the nature of measurement.
Measurement is a device for eliciting behaviors under standard
conditions. Most of the behaviors studied through the so-called
standardized test are verbal behaviors and, while teaching may be
studied by means of direct observation, measurement is frequently
nothing more or less than the counting of the number of verbal acts
correctly performed under a given set of conditions. Very great
attention has been given to the quantification of these verbal exercises
and the statistical treatment of data secured through them. All
statistical calculations involve, however, assumptions, and these
assumptions need most systematic study as they relate to the measurement
and prediction of teacher efficiency. Whether these statistical-verbal
operations are profitable or not for measuring and predicting teacher
efficiency needs further consideration and more study, notwithstanding
the very great amount of time and effort already devoted to it.

Finally, a problem that must be of continued concern to those interested
in the measurement and prediction of teacher effectiveness is that of an
adequate criterion. By and large, and with many exceptions, two criteria
have been used: 1) efficiency ratings of one sort or another, and 2)
measured pupil gains. Both present real difficulties. Over-all, general
ratings of teacher effectiveness have been shown to be, under current
conditions, exceedingly unreliable. Depending upon one's point of view,
this unreliability provides a substantial road block or a challenge to
the researcher interested in this area of research. A further difficulty
arises out of the fact that where a number of subjects are employed, as
is usually the case, drawn from a number of different school systems, or
even from the same school system, the teachers rated are generally not
rated upon the same performance. That is, they do different things.
While the differences in what teachers do are not ordinarily as extreme
as juggling acts and memorizing nonsense syllables, they are nonetheless
different. Thus, to use efficiency ratings, even with well developed
scales and highly trained raters, one must assume a large amount of
transfer from one set of tasks to another (tasks which may be called
teaching), to measure and predict teacher effectiveness.

The use of measured pupil gain as a criterion of teacher effectiveness
also presents very real difficulties. First of all, each teacher in the
modern school, within very broad limits, chooses his own purposes,
means, and methods of instruction. These ordinarily vary from one school
system to another and within named grade levels and subject fields.
Regardless of the validating data reported in test manuals, the tests
used in developing the pupil gain criterion will have varying degrees of
operational validity, except as the teachers agree to pursue certain
stated objectives which can be defined with sufficient clarity to
provide like meanings to all the participants. A second difficulty
arises out of the fact that, notwithstanding over a half century of
effort, many of the outcomes of learning and of teaching are poorly or
inadequately measured. The gaps in the criterion arising from inadequate
tests with which to measure pupil gain will be found to be considerable.
Finally, tests measure effects but not causes. The sources of the
effects observed are not readily ascertained, even under carefully
controlled experimental conditions. Some of these effects will reside in
the pupils, some in their general and special capabilities, some in
their previous training, and some in motivation. A few of the effects
are doubtlessly traceable to the home environment: socio-economic
status, respect for school education, and direct assistance rendered by
various members of the family. A few of the effects will be traceable to
the school and community: in some, teachers' and pupils' morale is high
and in some it is low. The physical facilities of different schools and
communities vary greatly. And finally, there are the direct and indirect
effects of the teaching of other teachers, both in the same and related
subjects. One of the very best measures of a teacher's effectiveness
will be found in what his students do in subsequent course work.
Accordingly, the problem of establishing an adequate criterion of pupil
gain will not be an easy one.
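
To make the pupil gain criterion concrete, the following minimal sketch computes a raw mean gain (final score minus initial score) for the pupils of each teacher. It assumes the simplest possible definition of gain, which is not necessarily the one used in any of the studies summarized; the teacher labels and all scores are hypothetical.

```python
from statistics import mean

# Hypothetical (initial score, final score) pairs for the pupils of two
# teachers; all values are invented for illustration.
scores = {
    "teacher_a": [(40, 52), (55, 61), (35, 50)],
    "teacher_b": [(60, 66), (70, 73), (65, 74)],
}


def mean_gain(pairs):
    """Average of (final - initial) over one teacher's pupils."""
    return mean(final - initial for initial, final in pairs)


def mean_initial(pairs):
    """Average initial score, a crude indicator of where the class started."""
    return mean(initial for initial, _ in pairs)


gains = {teacher: mean_gain(pairs) for teacher, pairs in scores.items()}
initial = {teacher: mean_initial(pairs) for teacher, pairs in scores.items()}

for teacher in scores:
    print(teacher, "mean initial:", round(initial[teacher], 1),
          "mean gain:", gains[teacher])
```

The raw figures show one class gaining more than the other, but they also show that the two classes began at quite different levels. The computation records an effect only; it cannot say how much of the difference is attributable to the teacher rather than to the pupils, the home, the school, or other teachers, which is the difficulty raised above.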

The purposes of this monograph are: a) to present a critical overview of
some seventy-five doctoral studies, all made at the University of
Wisconsin, that pertain in some respect to the measurement and
prediction of teacher effectiveness, and b) to offer such general
observations and new hypotheses as appear to be supported by the data
contained therein as may be of value in a further study of teacher
effectiveness.

As one views these seventy-five studies as a 
whole, they seem to have certain broad charac- 
teristics. First of all, the studies are almost 
without exception descriptive in character, and 
exploratory. The statistical devices used were 
principally those of means, standard deviations, 
the coefficient of correlation, prediction equa- 
tions, and factor analysis, all of the descriptive 
type of research. Experimental research and sam- 
pling are generally not employed. Some would 
contend that this is a great lack, and in a certain 
respect itis. To others, it might appear more 
appropriate to reserve these more refined or ad- 
vanced types of research for later research when 
promising leads have been found and the danger 
of confounding the problem through needless com- 
plexity has passed. The principal purpose of the 
investigations reported here has been to get some 
preliminary ideas about the nature of teacher ef- 
fectiveness and how it might be evaluated and pre- 
dicted. Whenever possible, important studies 
have been duplicated by a second, and sometimes 
a third parallel study to secure information about 
the consistency of the findings. In several in-
stances, follow-up studies were made after inter-
vals of five and ten years. In some instances, the
case study method has been employed to supple- 
ment statistical techniques. Taken as a whole, 
these many studies constitute a valuable source 
of ideas about what to do and what not to do in 
teacher personnel research. 

A very large number of data-gathering devices
of many kinds were employed in these investiga-
tions, some constructed for the particular inves-
tigation and some drawn from the literature. Most
of the more promising data-gathering devices re-
ported in the literature over a period of more
than thirty years have been tested in a variety of
situations and under somewhat comparable condi-
tions. The data-gathering devices consisted of
various sorts of observational techniques, rating
scales, questionnaires, inventories, and tests 
administered to both teachers and pupils. While 
the instruments were largely of a verbal nature, 
there were devices of a mechanical sort that meas- 
ured various aspects of persons. Almost every 
conceivable aspect of teacher ability, general and 
special, and evidence of efficiency was explored, 
including interests, attitudes, behaviors, knowl- 
edges, skills, and personality traits. Informa- 
tion was sought from superintendents, principals, 
supervisors, other teachers, pupils and the 
teachers themselves. One of the purposes of the 
analysis is to obtain new information and insights
about the validity and predictive efficiency of the 
data-gathering devices employed in the study of 
teaching. 

Finally, some assumptions appear to have 
been made in these studies about the nature of 
teaching. While there are data relative to most
aspects of teaching and the many special abilities
associated with them, in general, the several 
studies seem to relate to those aspects of teach- 
ing considered common to the activities and abil- 
ities of many teachers. While teachers perform 
many activities not directly related to pupils as 
a whole, these studies seem to be concentrated 
chiefly upon the direct classroom activities of 
teachers and those related directly to teacher-
pupil relationships.

As one looks through the several investiga- 
tions one finds various terms used to designate 
or describe the successful teacher. Frequently 
the word ‘efficiency’ is used. One will note, too, 
that the term is sometimes applied to the teacher,
as in teacher efficiency, and sometimes to the
teacher’s behavior, as in teaching efficiency.
While the term ‘efficiency’ is not without limita-
tions, possibly there might be some gain if this
term were applied to the teacher’s behavior rather
than to the teacher as a person. Such a proced-
ure would be in keeping with an earlier position
taken and stated in connection with the discussion
of psychological problems. Possibly if the term
were so used one might add somewhat to the ob-
jectivity of discussions of teacher-teaching effec-
tiveness. Even the teacher herself might look
upon behaviors somewhat more objectively than 
traits of character or personality. This, of 
course, is merely a hypothesis and will need 
careful consideration and data if such can be found. 

Remembering that the primary purpose of 
this report is to present a critical overview of 
certain studies pertaining to the measurement and
prediction of teacher effectiveness, done at the 
University of Wisconsin, answers will be sought 
to such questions as: 


1. What is the nature of the problem under
   investigation?

2. What was the methodology of these studies?

3. What are the criteria of teacher effectiveness?

4. How adequate were the statistical techniques?
   Are there alternative techniques that give
   better results?

5. Are there other theories of the nature and
   structure of human abilities that might give
   better results?

6. Are there discrepancies between teacher
   potential and performance that should be given
   more attention?

7. What are the personal and professional
   prerequisites to teacher effectiveness?

8. Can good teachers be distinguished from poor
   teachers on the basis of the data supplied in
   these studies?

9. What assumptions are made by these studies?

10. What overall conclusion might one draw from
    and about these studies?





JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961) 


CHAPTER II


THE CRITERIA EMPLOYED IN THESE INVESTIGATIONS 


A. S. BARR 


In general, with some exceptions, the crite-
ria of teacher effectiveness employed in the in-
vestigations here summarized are global in char-
acter and of two sorts, namely: a) efficiency rat-
ings and b) pupil gains as measured by tests
administered to the pupils before and after in-
struction.

Before examining the criteria employed in
these investigations, however, let us consider
first certain theoretical aspects of criterion de-
velopment. It should probably be first recalled
that the nature of the criterion will vary with the 
prediction to be validated. One may, for example, 
wish to predict the ability of students to complete
a particular four-year undergraduate program for
the education of elementary school teachers, or 
to complete the program for the education of sec- 
ondary school teachers of science, or to predict 
success in any or all of the positions for which 
teachers prepare within some broad area of spe-
cialization; or one may desire to predict the
on-the-job effectiveness of teachers in particular teach-
ing assignments and situations, or to predict some
highly generalized sort of teacher effectiveness 
that may encompass a very great variety of posi- 
tions and situations. The interests and goals of 
researchers are different and these are ordinarily 
respected when the investigator’s intent is ade-
quately verbalized.

A point sometimes forgotten is that predic-
tion is made on some sort of time sequence, i.e.,
it starts some place and is presumed to end some
place. One might start, for example, to make
predictions in the senior year in high school, in 
the junior or senior division of college, or after 
a year or two of teaching as for probationary 
teachers. The immediate terminal concern of the
investigator may be with any later point in the se-
quence: graduation from college, initial on-the-job
effectiveness, say, after one or two years of teach-
ing, or whatever later period the investigator’s
concern may be with. This time se-
quence has import for the nature of the criterion.
Sometimes one’s concern is merely with achieve-
ment, academic or otherwise, at a particular
point in the sequence; sometimes one’s concern is
more with potential than with current effective-
ness; or it may be with both. As one passes from
predictions about in-college success to predic-
tions about on-the-job success the nature of the
criterion employed may change, and the relative
emphasis upon potential and acquired effectiveness
may shift. Even with on-the-job evaluations
the administrator is in varying degrees interested
in the teacher’s potential, as well as current effec-
tiveness.

Keeping in mind these two concerns, namely 
that which refers to potential and that which refers
to already achieved competency, as they relate to 
the criterion, one may wish to recall the sorts of 
criteria commonly employed. 

There are three commonly employed criteria 
encompassing four approaches to evaluation. The 
criteria most commonly employed are: 

1) Efficiency ratings, which may be made by 
any number of persons, but most frequently by the
superintendent of schools or other members of his 
staff; 

2) Measures of pupil growth and achievement 
usually adjusted for differences in intelligence and 
other factors thought to influence growth and a- 
chievement; and 

3) A preservice graduation criterion com-
posed of a) measures of the foundations of ef-
ficiency: basic knowledges, skills, and attitudes,
and b) the personal prerequisites to effectiveness;
and professional competencies as inferred from
observation of performance in practice teaching,
internships, and other activities involving children.

Embodied within these criteria are four approaches
to teacher evaluation, combined in different ways
by different persons, institutions, and data gather-
ing devices, namely, a) evaluations made in terms
of the qualities of the person, as in personality rat-
ing; b) evaluations which proceed from studies of
teacher behaviors, as in the rating of performance
in terms of inferred personal qualities or desira-
ble professional characteristics; c) evaluations de-
veloped from data collected relative to presumed
prerequisites to teacher effectiveness, potential or
already achieved, represented by such psycholog-
ical constructs as knowledges, skills and attitudes;
and d) evaluations developed from studies of the
product, as for example, pupil growth and achieve-
ment.

Each of these several approaches to the devel-
opment of criteria has advantages and disad-
vantages, strengths and weaknesses, assets and
liabilities. We would like to consider carefully now
the problems associated with each, beginning first 
with the qualities of the person or traits approach. 
First of all, it should probably be observed that
qualities such as considerateness, cooperativeness,
ethicality and the like are not directly observable
but are inferences drawn from data. These data may
be of many sorts arising from the observation of 
behavior, interviews, questionnaires, inventories 
or tests. Whatever the source of information, 
judgments about the qualities are inferences and 
subject to all the limitations associated with in-
ference making, including the accuracy of the orig-
inal data upon which the inferences are based, and
the processes of inference making. Beyond this 
there is a most difficult problem in semantics a- 
rising out of the problem of attaching common 
meaning to the terms employed. Many persons 
have discussed this matter. Some have attempted 
to overcome this difficulty by resorting to behav-
ioral or operational definitions. Let us look at the
matter more closely. 

First, let us recall that there are many the-
ories of personality. As the word personality it-
self suggests, most of these theories relate to
characteristics of the person; thus we speak of 
personality traits. Another way of looking at the 
subject, however, is to consider not qualities of 
the person but characteristics of performance or 
behavior. Whichever approach is used, there is 
trouble ahead, since almost any approach has lim-
itations and assets. Many persons would prefer a
behavioristic approach, and what follows is dis- 
cussed in these terms. 

In pursuing a behavioral approach, we must 
recognize, however, that we immediately remove 
from consideration such aspects of personality as 
height, weight, age, complexion, bodily propor- 
tion, and physique, unless these static aspects of 
personality can be translated into behavioral e-
quivalents. Even though such static qualities may
present difficulties to those who would interpret 
personality in behavioral terms, a behavioristic 
approach still has some advantages in that such an
approach makes it possible to integrate the con-
cept of personality with that of methods, which has
always been considered an important aspect of
teacher effectiveness. In a sense, method, broad-
ly conceived, encompasses all teacher behavior
and thus personality. 

If we approach the problem in this way our
concern, stated in question form, becomes: Can de-
scriptions of behavior provided by such terms as
considerateness, co-operativeness, expressive-
ness, objectivity, and ethicality provide us helpful
ways of considering teacher effectiveness? Pos-
sibly this particular approach may
prove to be too remote to have any great practical 
value and other descriptive terms would be better; 
this also needs careful consideration. 

Psychology has defined the conditions for ef- 
fective learning in terms of certain principles of 
learning. This suggests the question: Are the 
techniques of teaching that are presumed to grow 
out of learning theory encompassed by the person- 
al qualities associated with personality traits, or 
are the behaviors found in the techniques of
teaching something different? Possibly they con-
stitute another constellation of values that one needs
to keep in mind. Possibly there is some unique 
constellation of human behaviors that constitute the 
essence of personality, and another constellation 
of activities that constitute technical competency. 
These are important considerations in criterion
development. 

If we attempt to characterize behavior in such 
broad terms as those suggested by personality 
traits, we must give some attention to choosing and 
defining the personal qualities that appear to be 
pertinent to teacher effectiveness. Many words in 
a standard collegiate dictionary purport to describe 
personal characteristics. How does one choose 
from these? The literature gives one the impres- 
sion that the choice of vocabulary has been based 
pretty much on personal preference.

Some years ago, the writer served on a pro-
fessional jury that attempted to prepare a short list
of descriptive terms to be used in teacher evalua- 
tion. More recently the author compiled a list of 
terms used to describe teaching effectiveness in 
studies on the measurement and the prediction of 
teacher efficiency. In some manner we need to
develop a list of the aspects of the person or of the
behavior that need to be considered. Once a list is 
agreed on, each term in the list must be defined. 

Before turning to problems of definition, there 
are several questions one might ask about the list
itself. How long should it be? How much overlap 
can one expect or tolerate? Are the different qual- 
ities to be looked upon as supplementary or com- 
plementary, or may they be conflicting? Are there 
hierarchies, patterns, or sequences of behaviors, 
or of qualities that we should consider? If the qual-
ities are behaviorally defined, can some aspects of
behavior be thought of as superficial, unimportant, 
or trivial and others as basic, highly potent, and 
primary? Should the terms used to describe teach- 
er behavior reflect some particular philosophy of 
education, theory of learning, or concept of human 
relationship? 

To provide acceptable working definitions of 
the descriptive terms to be used in such an ap-
proach is extremely difficult. Presumably, if one
makes a behavioristic approach all definitions will 
be operational definitions. From this point of view 
what does a reliable, emotionally stable, and re- 
sourceful teacher do? If judgments about teachers 
are to be based on observations of teachers’ beha- 
viors, how do we know what to look for and what to 
ignore? Over and above the counting of behaviors, 
if that is our concern, there is also the matter of 
pertinency. Whether a behavior, or aspect of be- 
havior, is pertinent to some particular quality de- 
pends on how the quality is defined. If the list of 
terms is highly condensed, many subtle shades of 
meanings will probably need to be considered. 

Having listed and defined the terms to be used,
one may then turn to the collection of the data.
There are many different sorts of data gathering
devices. By what means may observers reach 
sound judgments about whether the data collected, 
or to be collected, may be considered as evidence 
of the presence or the absence of some particular 
quality? If one is observing behaviors, can behav-
iors be considered out of context? What aspects
of the context should one consider? Is the noting 
of the presence and absence of behaviors suf- 
ficient? What dimensions of behavior should one 
consider? Is counting enough, or should the be- 
haviors be evaluated according to some scale of 
values? Should one consider isolated behaviors 
or patterns of behaviors? Should the impact of 
particular behaviors as contrasted with mere 
presence or absence of behaviors be considered? 
Their extent, duration, or intensity? How will
the score, if there is a score, be expressed? 
How may one summarize the data? 

When one attempts to reach some overall 
judgment about teacher effectiveness from judg- 
ments about the separate aspects of personality, 
one is confronted with a further necessity. It is 
common practice to give each item a numerical 
value. These items are then added and divided 
by their number to calculate some sort of aver- 
age. But can we safely assume that these values 
can be added? Is an average an adequate repre- 
sentation of the data from which inferences may 
be drawn? Do some aspects of behavior have 
special potencies in and of themselves? Are
there upper and lower cutoff points in the amounts
of these values that an effective teacher must
possess? May a teacher’s overall efficiency rest
on the presence or absence of some particular 
quality of behavior instead of an average of many 
values? Do various combinations of behaviors or 
qualities have particular significance? 
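
The difference between an averaged rating and a rule based on cutoff
points can be illustrated with a short sketch in Python. The item names,
the five-point scale, and the lower cutoff of 2 are hypothetical, chosen
only for illustration; none of them is taken from the studies
summarized here.

```python
from statistics import mean

# Hypothetical ratings on a 1-5 scale; the item names are illustrative only.
teacher_a = {"considerateness": 4, "cooperativeness": 4,
             "objectivity": 4, "ethicality": 4}
teacher_b = {"considerateness": 5, "cooperativeness": 5,
             "objectivity": 5, "ethicality": 1}


def average_rating(ratings):
    """Compensatory composite: add the item values and divide by their number."""
    return mean(ratings.values())


def meets_cutoffs(ratings, minimum=2):
    """Conjunctive rule: every item must reach the assumed lower cutoff."""
    return all(value >= minimum for value in ratings.values())


for name, ratings in (("A", teacher_a), ("B", teacher_b)):
    print(name, average_rating(ratings), meets_cutoffs(ratings))
```

In this invented case the two teachers receive the same average, yet
only teacher A satisfies the cutoff rule. An average allows a high value
on one item to offset a very low value on another, which is the
assumption the questions above call into doubt.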

In spite of all of the problems that come out 
of attempts to use qualities of the person as a 
criterion of effectiveness there are many reasons 
for giving it primary consideration. First, most 
of the efficiency ratings employed in evaluating 
teacher effectiveness, almost without exception, 
include qualities in one form or another, if not
solely, at least in part. Secondly, behaviors, in 
and of themselves, are not the critical elements 
of efficiency. There are many alternatives from 
which one may choose in a particular situation. 
As a matter of fact there are ordinarily many al- 
ternatives for each possible behavior, one being 
about as effective as another. The quality ap-
proach provides a means of getting beyond the in-
cidental to the critical. Finally, many persons
consider personal fitness an important considera-
tion in and of itself, completely aside from its ef-
fects, at least aside from its primary or direct
effects. In this sense personal fitness is an im- 
portant criterion. 

The criterion of teacher effectiveness may 
also be behaviorally defined, directly, and without
the summarizing operations provided by person-
ality traits. Behaviors have already been in-
volved in the discussion of quality-oriented cri-
teria, but there they were employed largely as
building stones out of which judgments about the
personal prerequisites to teacher effectiveness
were constructed and as a means of summarizing
and defining the personal prerequisites of teach-
er effectiveness and not as criteria in and of
themselves. But behaviors may themselves be- 
come criteria. A very attractive feature of a be- 
havioral criterion is that behaviors may be 
directly observed by all who care to look. With 
more adequate instrumentation observability can 
be expected of internal behaviors as well as ex- 
ternal behaviors. It seems quite reasonable to 
expect, with the development of appropriate tech- 
niques, to attain a very high degree of objectivity. 
This, of course, is a highly desirable feature of 
a criterion of teacher effectiveness. But here too, 
as with other criteria, one runs into difficulties.
The problems that one encounters in this approach
are, however, of a different order. To many per-
sons behaviors are merely the surface aspect of
life and they would look beyond these to what they
would call the essence of things. They would look,
for example, for the primary mental prerequi- 
sites to teacher effectiveness or to qualities of the 
person. There are also those that would argue 
that a behavioral criterion is too cumbersome. 
There are just too many behaviors to make this 
approach practical. One purpose of science is to 
achieve a degree of simplicity and control in an 
exceedingly complex world. From this point of 
view there are just too many behaviors to deal 
with them individually. Thus some psychologists 
attempt to categorize, generalize and structure 
behaviors in some fashion to secure control over 
them. 

Another difficulty into which those who have 
attempted to build behaviorally oriented criteria 
have run is that of alternatives. With highly sim-
plified situations, as in a particular highly simpli-
fied learning maze, there may be few alternatives;
but in most classroom situations there are many
alternatives; the restrictions generally speaking
are only those found in the minds of man. Pos-
sibly more could and should be done with catego-
rizing and generalizing these alternatives; the
quality approach already discussed is one means 
of doing this; there are doubtless other ways of 
doing this. 

Another problem that has arisen to plague 
those who would construct behavior criteria is
the problem of context or appropriateness. Ex-
cept in highly simplified functions and models,
acts (i.e. specific behaviors) cannot be said to be
good or bad in and of themselves. They are good
or bad, effective or ineffective, appropriate or in-
appropriate when considered in relation to
purposes, persons and situations. To consider
their worthwhileness out of context is much like at-
tempting to consider the question: Which is bet-
ter, a hand saw or a screw driver? Well, the an-
swer is: it depends upon the purpose, the person
involved, and the immediate situation. We shall 
examine this problem further when we examine 
the criteria employed in the investigations here 
under consideration. 

Another approach to criterion development is
that which employs such psychological concepts
as the knowledges, skills, and attitudes in terms
of which the purposes of education have been so
generally expressed. In this approach the empha-
sis would appear to be upon the mental prereq-
uisites to teacher effectiveness as contrasted with
the bio-physical prerequisites. If one assumes
that the teacher brings his whole self to teaching,
this is not the whole self but the mental self. Even
so, the concept is an important one. These men-
tal prerequisites are presumed to act as controls
over behavior and thus control teacher effective-
ness. If we think of qualities as a convenient way
of summarizing and characterizing behavior,
knowledges, skills, and attitudes are the moni-
tors of behavior; and as such they become an im-
portant part of the machinery for assessing effec-
tiveness except possibly for the subjects and ac-
tivities heavily loaded with psychomotor content.

The criterion here discussed is the one or- 
dinarily applied upon the completion of the certi- 
fication program or later when certification may 
be by examination. It aims chiefly at potential in 
that it may be applied prior to employment. 
Properly applied it has substantial value when ap- 
plied singly or in conjunction with other criteria. 

The problems associated with this criterion 
are those associated with testing everywhere. 
The basic one is that of validity but associated 
with it are all the problems that one may encounter 
in test construction in any field. Presumably one 
starts with a set of well defined and carefully val-
idated objectives. The objectives are the objec-
tives of the teacher education program. Some
sort of a curriculum is assumed; and it is assumed 
that this curriculum has some established rela- 
tionship to the purposes of teacher education 
as one might develop through the processes of cur-
riculum making. The testing is most frequently
made in terms of the knowledges, skills, and at-
titudes sought in the teacher education curric-
ulum. A very real value that may be had from
this approach is to keep those engaged in criterion
development reminded of the coverage to be ex-
pected by the concept of teaching. In this respect,
as important as the direction of learning in the
classroom situation is, it is only one of the many
functions performed by teachers. Teachers are 
also supposed to be friends and counselors of pu- 
pils; teachers today are asked to participate in 
many extra-curricular activities; and teachers 
engage in many activities associated with their
profession. If pupil growth and achievement is
looked upon as the ultimate criterion of teacher 
effectiveness, the mental attributes here pre- 
sumed to be prerequisites to teacher efficiency 
constitute an immediate or proximate criterion of 
effectiveness. 

A fourth type of criterion of teacher effective-
ness is that of pupil growth and achievement,
which is usually expressed as pupil gain scores
based upon achievement tests administered prior 
to instruction and again at some subsequent date 
when a particular unit of instruction or course has 
been completed. To most persons this criterion 
is considered a primary criterion against which 
all other criteria should be validated. Like the 
other criteria it is subject, however, to very def-
inite limitations.
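
A pupil gain criterion of this kind can be sketched in a few lines of
Python. The records below are invented. The adjustment shown, which
subtracts the posttest score predicted from the pretest by a
least-squares line fitted across all pupils, is only one common way of
adjusting gains and is an assumption of this sketch, not a description
of the procedure followed in any particular study summarized here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (teacher, pretest score, posttest score).
records = [
    ("X", 40, 55), ("X", 60, 75),
    ("Y", 50, 60), ("Y", 70, 80),
]


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_gains_by_teacher(rows):
    """Return {teacher: (mean raw gain, mean adjusted gain)}."""
    slope, intercept = fit_line([r[1] for r in rows], [r[2] for r in rows])
    raw, adjusted = defaultdict(list), defaultdict(list)
    for teacher, pre, post in rows:
        raw[teacher].append(post - pre)  # raw gain: posttest minus pretest
        adjusted[teacher].append(post - (intercept + slope * pre))  # residual
    return {t: (mean(raw[t]), mean(adjusted[t])) for t in raw}


for teacher, (raw_gain, adjusted_gain) in mean_gains_by_teacher(records).items():
    print(teacher, raw_gain, round(adjusted_gain, 2))
```

With these invented figures teacher X shows the larger mean gain under
both calculations. With real data the two calculations need not agree,
and neither one separates the teacher's own contribution from the other
influences upon pupil growth.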

First of all the criterion of pupil growth and 
achievement is built about the classroom activi-
ties of the teacher. Directing the learning of pu-
pils, as has already been said, may be a primary
responsibility of the teacher but not her sole re- 
sponsibility. Besides directing the learning of 
pupils, teachers in most school systems are sup- 
posed to be friends and counselors of pupils, they 
are expected to be faculty advisors to or directors 
of many extra-curricular activities, they partici- 
pate in educational policy making and advise the 
administration as individuals and as groups through 
committee action, they must cooperate and work 
with their co-workers in the discharge of many 
responsibilities, they are called upon to partici- 
pate in many community activities and they are 
members of various professional groups. Com-
petency in these extra-classroom activities weighs
heavily in the worth that most administrators at- 
tach to the teacher. 

Another difficulty arises in fitting the gains 
tested to the instructional goals of the teacher. In 
global survey testing the problem is that of choos-
ing tests that faithfully represent the generally
accepted purposes of education and those of teach-
ers in specific learning-teaching situations. The
tests, and other measuring instruments, may be 
valid and reliable in some highly generalized sit- 
uation but may not be in keeping with the demands 
of the particular situation or in keeping with the 
teacher’s purpose. Workers in this area have at-
tempted to overcome this difficulty in a number of
ways: a) Some have sought to secure more u-
niformity by selecting pupils with common needs
and teachers with common purposes; b) Some have 
given up global testing and have substituted for it 
custom-made testing where each testing program 
is individualized and tailored to the needs of par- 
ticular learning-teaching situations; and c) Yet 
others have attempted to randomize the ef- 
fects and variables by different sorts of statistical 
devices. The difficulty is not easily resolved. 

Yet another difficulty arises out of securing 
valid and reliable measures of the accepted
purposes of education. While much progress has
been made in developing valid and reliable tests, 
few would hold that adequate tests have been de- 
veloped for the major purposes of school educa- 
tion. Substantial progress has been made in the 
realm of direct recall, recognition and informa- 
tion tests but less progress has been made in 
measuring the products of the teacher’s efforts in
such areas as problem solving, personality de- 
velopment, mental health, aesthetic learning and 
emotional growth. Generally speaking the tests 
ordinarily used in developing a pupil gain crite- 
rion are quite limited in this respect. 

It should, also, probably be recalled that tests 
measure results but provide little information
about how these effects were produced. The 
teacher effect is only one of many effects that 
produce changes in pupil growth and achievement. 
One of the real difficulties that arises in the uti- 
lization of this criterion is that of isolating the 
teacher effect. There are many other effects: a)
a really good prior teacher or current colleague
may produce measurable changes in all subsequent
course work and in the achievement of pupils
taught by other teachers. The same can be said
of teachers who effectively teach for transfer.
Their influence will be felt in all classes wher-
ever their pupils go; b) There is also much di-
rect instruction in every class not directly trace-
able to the teacher. This instruction is provided
by other pupils, parents, and contemporaries
generally; and c) self-teaching and learning in
spite of the teacher is always possible. Some of
this may be done by self-teaching texts and ma- 
chines. 

Besides the very broad problems and diffi- 
culties here discussed, there are many technical 
problems that have already been adequately re- 
ported in the literature. 

In the brief statements here given, some of 
the more commonly experienced problems of cri- 
terion development have been reviewed. It would 
seem that while there are different approaches to 
criterion development, each has its advantages 
and disadvantages. Each presents very real prob-
lems from the point of view of developing valid
and reliable criteria. For the time being it might 
be best to look upon the several criteria as com- 
plementary rather than antagonistic. The 
strengths can possibly be preserved and the weak- 
nesses avoided by a discrete choice from among 
them according to the demands of the situation. 
In the discussion to follow, illustrative materials 
will be drawn from the several studies here sum- 
marized to further clarify the problem of criteri- 
on development. 

The series of studies here summarized, as has
already been said, employed a variety of criteria;
some of the studies were directly concerned and
some not with criterion development. All con-
tribute in a manner to criterion development, none
the less, in that they aimed to provide general
orientation for the measurement and prediction 
of teacher effectiveness. One of the early studies 
not directly concerned with criterion development 
but pertinent none the less was that of Barr and 
Emans (7) in which they attempted to secure in- 
formation about the characteristics considered by 
supervisors and administrators in the rating of
teachers. Two hundred and nine rating scales
were analyzed. From their analysis it would ap-
pear that the typical rating scale covered such i-
tems as 1) Classroom management (attention to
physical condition, housekeeping and appearance
of room, discipline, economy of time, records
and reports, and attention to routine), 2) In-
structional skill (selection and organization of
subject matter, definiteness of aims, skill in as-
signment, attention to individual needs, skill in
motivating work, attention to routine, skill in di-
recting study, skill in stimulating thought, daily
preparation, pupil interest and attention, pupil
participation, attitude of pupils, results), 3) Per-
sonal Fitness (some twenty-nine items),
4) Scholarship and Professional Preparation, 5)
Effort toward improvement, 6) Interest in work,
and 7) Ability to work with others. From this an-
alysis it would appear that the chief concern of
the administrators in the late twenties was with
the teacher in the classroom.

From a comprehensive survey of the litera- 
ture, Barr (4) attempted to provide a generalized 
definition of the teaching process, in which he 
conceived of the major operations as being: 1) 
Determining pupil need (searching for causes of 
satisfactory and unsatisfactory pupil growth and 
achievement); 2) Formulating educational objec- 
tives; 3) Choosing means, methods, and materi- 
als; 4) Guiding the learning process (assuring 
favorable condition, and overseeing the process); 
5) Evaluating outcomes. 

Another attempt was made by Barr (5) to ascertain the
concerns of those who have conducted research relative to
the measurement and prediction of teacher effectiveness.
The main categories of concern were found to be as follows:


I. Personal qualities (Fourteen items)

II. Competencies

    A. As director of learning
        1. Skill in identifying pupil needs
        2. Skill in setting and defining of goals
        3. Skill in creating favorable mind sets (Motivation)
        4. Skill in choosing learning experience
        5. Skill in following the learning process
        6. Skill in using learning aids
        7. Skill in teacher pupil relations
        8. Skill in appraising pupil growth and achievement
        9. Skill in management
       10. Skill in instruction (General)

III. Effects of teacher leadership (Results)

IV. Behavior control

    A. Knowledges
        1. Knowledge of subject matter
        2. Knowledge of child behavior and development
        3. Knowledge of professional practices and techniques
        4. General cultural background
        5. Scholarship

    B. Generalized Skills
        1. Skill in problem solving
        2. Work habits
        3. Skill in Human Relationship
        4. Skill in the use of language

    C. Interests, attitude, ideals
        1. Interest in pupils
        2. Interest in subject
        3. Interest in teaching and school work
        4. Interest in community
        5. Social attitudes
        6. Professional attitudes
        7. Efforts toward self-improvement
        8. Interests (General)
        9. Interests in extra-curricular activities

Studies such as this indicate the broad areas of concern.

Another source of information relative to the constituents
of a valid criterion of teacher effectiveness will be found
in the many studies that have been made of the curriculum
for the education of teachers. Butler (15), Doane (18),
Goldgruber (28), Mitchell (56), Schwahn (71) and Von Eschen
(81) all conducted studies of the education of teachers.
These studies contain detailed information about what
teachers are supposed to know, feel, and do as teachers.
The instrument employed by Schwahn, for example, in the
collection of the data for his investigation, was
constructed (after a careful examination of the literature)
by a team of professionally competent school men working as
a statewide cooperative committee. The instrument was
responded to by some one hundred fifty-one members of the
state college and the University of Wisconsin faculties
engaged in teacher education. Considering the diversity of
training and experience of the respondents, Schwahn's
summary constitutes an important source of information
about the elements that should be encompassed in a valid
criterion of teacher effectiveness as stated by the
leadership for the state of Wisconsin. Three ways of
describing teaching efficiency were identified; namely 1)
character and personality traits; 2) desired competencies,
ability to do, performance; and 3) controls over behavior:
knowledges, skills, and attitudes. The relative values
attached to the several aspects and activities of Teacher
Education, expressed as ranks, were summarized under the
following broad categories:


    Pupil Health and Development
    General Education
    Educational Psychology
    Professional Relations
    Classroom Management
    Guidance
    Philosophy of Education
    Curriculum
    Educational Methods
    Community Relations
    Tests, measurements and evaluation
    School organization, practices, and support


If one accepts the findings of studies such as those here
cited as indicative of the components of a valid criterion
of teacher effectiveness, most of the criteria employed in
the series of studies here summarized relate only to
certain aspects of teacher competency, and not to overall
effectiveness.

With the immediately preceding materials defining the scope
of teaching, and the earlier discussion of theoretical
considerations, in mind, we may now examine rather
systematically the criteria employed in the series of
investigations here summarized. The criteria will not be
examined for all of the studies but only for those that
seem to have some unique features. One of the studies that
employed some unique approaches to criterion development
was that by Barr, Torgerson, and others (8). Five criteria
were employed: 1) gain in pupil achievement as measured by
the Stanford Achievement Test, the gains being computed as
a) gain in the total raw score, b) gain in the arithmetic
score, and c) gain in the accomplishment quotient; 2) a
composite of scores on seven rating scales; 3) a composite
of nine measures of qualities commonly associated with
teaching success; 4) a composite of six tests of teaching
ability chosen from composite three; and 5) a composite of
all nineteen variables combined variously. The major
conclusions were: 1) the coefficients of correlation
obtained between ten selected measures of teaching ability
and gain in pupil achievement are uniformly low; 2) the
coefficients of correlation between the nineteen variables
employed in the investigation and the five composite
criteria provide conflicting evidence (the authors cite the
rating scales and tests that seem to be most valid
according to the criterion employed); 3) a composite of the
total pupil gain score on the Stanford Achievement Test,
the Torgerson Diagnostic Teacher Rating Scale, and the
Knight Aptitude Test yielded a multiple correlation of .70
with a composite of all measures employed as criteria; 4)
fourteen of the measures had a forecasting efficiency
superior to general merit ratings; and 5) when the
Torgerson Diagnostic Teacher Rating Scale and the Knight
Aptitude Test were employed to predict ratings in five
categories represented by A, B, C, D, and E letter ratings,
64 per cent of the predictions fell within the correct
criterion group, 32 per cent were misplaced by one group,
and only 4 per cent were misplaced by two letter groups.

The criteria were carefully chosen. They were based upon a
careful study of the literature in the field and prolonged
theorizing about the nature of teaching ability by the team
that conducted this investigation. The superintendent's
responsibility with reference to the teaching personnel was
recognized by one set of measures. From the point of view
of school organization and administration this must always
be, whether valid or not, a primary criterion. The teacher
rating scales were carefully drawn and were the best to be
found. There were six of these. The tests of qualities
thought to be associated with teacher effectiveness were
those that seemed to be best in terms of the measures that
were then available and supported by data in the
literature. There were nine of these, relating to
characteristics hypothesized to be important in teacher
effectiveness. The pupil gain scores were based upon the
Stanford Achievement Test. While, as pointed out by the
authors, the criteria employed are not without
inadequacies, arising in part out of the invalidities and
unreliabilities of the several measures employed, taken
individually, but arising also in part from other
inadequacies, the study does point up certain matters that
one should keep in mind. One of the findings is that
different criteria measure different aspects of teacher
effectiveness. Based upon data such as these, and
considering the present state of our knowledge, many
persons believe that if one desires an overall criterion of
teacher effectiveness, probably the safest procedure is to
employ a variety of measures, all possessing validity from
some particular point of view, applied and evaluated by
more than one person, and based upon studies of the teacher
over a considerable period of time. If one does not desire
an overall competency evaluation, the several components
may still be considered separately. If one examines the
correlations from this point of view it can be observed
that none of the correlations with the pupil gain criterion
are high, with about as many negative correlations as
positive, ranging from -.32 for the Giles Teacher Rating
Scale to .23 for the Wood Health Scale. The only
correlation of any size (.63) is between a General Merit
Rating and a composite score on six Teacher Rating Scales,
which is probably due to a halo effect. When a composite of
nine measures of qualities thought to measure teaching
ability was used as the criterion, all of the correlations
were around zero: .08 for pupil gain, and -.06 to .12 for
the six teacher rating scales. When the criterion is a
composite of all nineteen measures, taken either singly or
as three equally weighted components, the highest
correlations, ranging from .74 to .84, were with the
teacher rating scales, arising chiefly out of the halo
effect that results from the fact that the same persons
made the teacher ratings. Very clearly the different
criteria give different results. One can make differential
predictions, but the effectiveness of a prediction depends
upon the criterion.

Another illustration of the multiple global approach to
criterion development will be found in a study by Lins
(49). He employed three sorts of criteria: 1) supervisory
ratings; 2) pupil evaluation; and 3) residual pupil gain.
The supervisory rating criterion was a composite of five
ratings: two by members of the Department of Education,
University of Wisconsin, one by a representative of the
State Department of Public Instruction, one by the
principal or superintendent under whom the teacher taught,
and one by a member of the department of educational
methods under whom the teacher had done her practice
teaching. The ratings were all made on the Wisconsin
adaptation of the M-Blank of the Evaluative Criteria plus
certain other instruments described in the original report.
The lowest correlation of any one group of individuals
with the composite was .72 ± .08 and with other individuals
was .403 ± .121. The reliability of the criterion
determined by the chance halves method and the
Spearman-Brown prophecy formula was .862. The ratings were
preceded by almost one full year of weekly discussions
among the members of the team responsible for the
collection of the data, and each teacher was visited by a
trained observer at least once for more than a single class
period, usually a half school day, in which she was
observed at work and interviewed. The pupil evaluations
were all made according to a carefully designed plan and
under the direction of a single person who visited each
school and secured the evaluation by anonymous ballots. The
residual pupil gains were the discrepancies between the
actual gains made upon certain standardized subject matter
tests, described in the report, and a predicted pupil gain
based upon a four-variable prediction equation derived from
average gain scores, the pretest score, mental age, and an
intelligence quotient. These criteria were established with
great care. One of the most important findings was the low
intercorrelations among the three criteria. These
correlations were as follows:


                                      X1       X2       X3

    X1  Composite M-Blank ratings    1.000     .279     .193
    X2  Pupil evaluation              .279    1.000     .055
    X3  Residual pupil gain           .193     .055    1.000


In this investigation the three criteria were 
not combined into a single global criterion but 
each was treated separately. Some teachers were 
preferred by the administration, some were liked 
by the pupils and some taught in classes where 
there were substantial pupil gains in terms of the 
tests employed to measure pupil gains, and gen- 
erally speaking these were not the same teachers. 
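
The reliability of .862 reported above for the composite rating
criterion was obtained by the chance halves method stepped up with the
Spearman-Brown prophecy formula. A minimal sketch of that step in
Python follows. The function names are illustrative only, and the
half-test correlation shown is back-computed from the reported .862;
it is not a figure given in the original report.

```python
def spearman_brown(r_half: float, k: float = 2.0) -> float:
    """Estimate the reliability of a measure lengthened k times.

    r_half is the correlation between the two chance halves;
    k = 2 gives the usual split-half correction.
    """
    return k * r_half / (1.0 + (k - 1.0) * r_half)


def implied_half_correlation(r_full: float) -> float:
    """Invert the k = 2 formula: half-test correlation implied by r_full."""
    return r_full / (2.0 - r_full)


# Back-computed illustration (an assumption, not a reported value):
# the half-test correlation that would step up to .862.
r_half = implied_half_correlation(0.862)
print(round(r_half, 4))
print(round(spearman_brown(r_half), 3))
```

The formula assumes the two halves are parallel measures; where the
halves differ in content or variance the stepped-up value is only an
approximation.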

These findings are in substantial agreement with those of
Barr and others (8) previously cited. Many of the
correlations are quite low, but enough statistically
significant correlations were found to give multiple R's of
.74 with the residual pupil gain, .72 with a composite of
supervisory ratings, and .60 with pupil evaluation of the
teacher. Obviously many things contribute to the low zero
order coefficients of correlation. These low coefficients
arise in part because of the discrepancies between theory
and practice, i.e., between the ideal and the actual, that
have been frequently noted in the literature. Some would
correct this situation by narrowing the area of
investigation and by introducing better controls. This is a
very worthwhile concern but not the problem to which this
particular investigator addressed himself. His concern was
that of predicting teacher effectiveness when viewed by a
variety of persons, under varying conditions, and where
more than one sort of criterion is employed. For those who
wish to make predictions under carefully controlled
conditions, the very great number of experimental studies
of learning and teaching methods reported in the literature
provide a gold mine of tested materials of this sort,
especially if it is teaching and learning efficiency that
is the investigator's concern. This is especially true if a
behavioral approach is made to teacher effectiveness and
method is broadly conceived to encompass total behavior.

A somewhat different approach to criterion development was
made by Rostker (66). In the first place he limited his
study to the teaching of citizenship in the eighth grade of
non-departmentalized one-teacher schools. Pre- and
post-tests purporting to measure the outcomes of such a
course were administered near the beginning of the school
year and six months later, in addition to pre- and
post-unit tests. The tests were all given by trained
examiners. The Wrightstone tests of "Applying
Generalizations to Social Studies Events," "Abilities to
Organize Research Materials," and "A Scale of Civic
Beliefs," the Hill Test in Civic Attitudes, and the Hill
Test in Civic Information were administered to all pupils.
An attempt was made to measure some of the less tangible
outcomes of citizenship instruction. Besides these overall
measures each teacher was asked to teach two different
units of subject matter, the same for all teachers, with
pre- and post-instruction tests.

In order to calculate the residual pupil gains the
Kuhlmann-Anderson Intelligence Test, the Traxler Silent
Reading Test, and the Sims Score Card for Socio-Economic
Status were administered. Prediction equations were
developed for the eight separate measures of pupil gain and
for a composite of the several measures that purported to
measure factors conditioning achievement. The statistical
treatment of the data was reasonably sophisticated. The
several criteria are treated in such a manner as to make it
possible to observe the particular strengths and weaknesses
of each teacher, were one interested in the problem of
differential prediction of abilities or the education of
teachers in service. Some researchers in this area believe
that differential prediction is an important matter since
teachers cannot be equally competent in all respects, as is
shown by the data. At least part of the problem would seem
to be that of ascertaining the respects in which teachers
are strong and weak.
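
The residual pupil gain criterion used by Lins and by Rostker is the
difference between a pupil's actual gain and the gain predicted for
him from measures taken independently of the teacher. The sketch
below illustrates the idea in Python under simplifying assumptions:
a single predictor (the pretest score) stands in for the several
predictors the investigators used, the prediction equation is fitted
by ordinary least squares on all pupils pooled, and the pupil data
and teacher labels are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residual_gains(pupils):
    """Mean (actual gain - predicted gain) for each teacher.

    pupils: list of (teacher, pretest_score, gain) tuples. The
    prediction equation is fitted on all pupils pooled across teachers.
    """
    slope, intercept = fit_line([p[1] for p in pupils],
                                [p[2] for p in pupils])
    by_teacher = {}
    for teacher, pretest, gain in pupils:
        predicted = intercept + slope * pretest
        by_teacher.setdefault(teacher, []).append(gain - predicted)
    return {t: mean(r) for t, r in by_teacher.items()}


# Hypothetical data: (teacher, pretest score, raw gain)
pupils = [
    ("A", 40, 12), ("A", 55, 9), ("A", 70, 7),
    ("B", 42, 8), ("B", 58, 5), ("B", 73, 2),
]
result = residual_gains(pupils)
print({t: round(v, 2) for t, v in result.items()})
```

A positive value means the teacher's pupils gained more than pupils
of comparable pretest standing would be expected to gain. Because the
residuals of a least-squares fit sum to zero, teachers are in effect
compared with the pooled average rather than with an absolute
standard.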

Rostker (66) gave much attention to the development of an
adequate criterion of teacher effectiveness. In the first
place he attempted, through the use of the Hill and
Wrightstone Tests, to measure something more than
informational learning. Then he attempted to measure some
short time achievement and some longer time outcomes. He
also attempted to isolate some of the pupil influences on
learning through a modified analysis of co-variance. Among
his findings which seem important for criterion building
are these:


1. The intercorrelation between the raw gain scores on the
unit achievement tests for two very carefully standardized
units which all teachers taught was .14; the correlation
between the composite unit raw gain score and the
Wrightstone raw gain score composite was .02; and the
correlation between the unit raw gain score composite and
the Hill raw gain score composite was .12, as was the
correlation between the Wrightstone and Hill raw gain
scores. For the adjusted composite gain scores the
correlation between the unit scores and the Wrightstone
scores was .38; between the unit scores and the Hill scores
.61; and between the Hill scores and the Wrightstone scores
.35, all of which seems to indicate that, along with errors
of measurement, these composites measure different things.







2. Rostker does not give the coefficients of correlation
for the several different criteria but only for the single
overall criterion. For this criterion, only seven of the
thirty dependent variable scores gave coefficients of
correlation of .30 or more, as follows:


    1. The Torgerson Diagnostic Teacher Rating Scale       .43
    2. The Michigan Teacher Rating Scale                   .39
    3. The Hartman Social Attitudes Test                   .38
    4. Almy-Sorensen Teacher Rating Scale                  .36
    5. Personal Fitness (based upon the Charters list)     .38
    6. Size of School (number of pupils)                   [illegible]
    7. A Wisconsin Personality Test                        .30


It is interesting to note that, except for the 
Torgerson Diagnostic Teacher Rating Scale, the 
size of the school was about as effective as any- 
thing else in predicting teacher effectiveness as 
here measured. 

Anderson (1) chose as the major purpose of 
his research a study of the criterion itself. His 
analysis is based upon data gathered through the 
use of the following data-gathering devices: 


1. The principal's rating of the teacher upon the Wisconsin
   Adaptation of the M-Blank (M)
2. A rating of the teacher by pupils on an adaptation of
   the Bryan Reaction to Instruction Questionnaire
3. A teacher self rating upon a special form (Se)
4. A rating of the teacher by his peers upon a special
   form (Pe)
5. A state supervisory rating (Ax)
6. Pupil score on an initial administration of standardized
   achievement tests (P. T.)
7. Final scores on selected standardized tests (F)
8. Residual pupil gain (AG)


The intercorrelations among these criteria are 
shown at the top of the next page. 


Among Anderson’s conclusions one finds state- 
ments like the following: 


1. No correlations appreciably different from zero were
   found between the evaluations of teachers made through
   the several ratings and pupil achievement as measured in
   this investigation.

2. The achievement tests employed in such investigations
   must have curricular validity and the pupil gains must
   be large enough to be statistically significant.

3. Most ratings are very subjective and the
   intercorrelations among them low.


Anderson's findings seem to be in substantial agreement
with those of Lins (49) and Barr, Torgerson and others (8).

A number of the studies employed ratings in one form or
another as the sole criterion of teacher effectiveness.
Hampton (33), for example, used the Estimate of Teacher's
Qualifications Based on Teaching Experience, used at Iowa
State Teachers College as an instrument to assess the
success of Iowa State Teachers College graduates. For her
study she chose from this blank twelve items as follows: 1)
cooperation and loyalty, 2) knowledge of subject matter, 3)
courtesy and friendliness, 4) interest in school
activities, 5) discipline, 6) emotional poise, 7) general
culture, 8) health and vitality, 9) personal appearance,
10) resourcefulness, 11) response to criticism, and 12)
speech. These characteristics are among those frequently
found in blanks of this sort.

From intercorrelations and a factor analysis 
she concluded among other things that: 


1. A general factor did not appear to account for the
   intercorrelations. When the same instrument was used
   more than once on the same teachers, three factors
   seemed to account for the intercorrelations but none of
   the factors appeared to be consistent in all three
   rotations.

2. The correlations between successive trait ratings of the
   same persons were significantly different from zero when
   the raters were the same but zero when the raters
   changed.

3. The trait ratings on a five-point scale and on a
   paired-comparison rating were all positively correlated,
   with nine out of the twelve correlations statistically
   significant.

4. The high correlations between the individual trait
   ratings and the general category rating seem to indicate
   that the individual trait ratings added little to our
   knowledge.

5. The high correlations between the individual ratings on
   Resourcefulness and Knowledge of Subject Matter and the
   general merit ratings on the several instruments would
   seem to indicate that these are more characteristic of
   good teachers, in the minds of superintendents, with
   Health and Courtesy negatively correlated with teacher
   efficiency.


A number of the studies of this series employed the
Wisconsin Adaptation of the M-Blank of the Evaluative
Criteria, with a single school rating by the principal or
superintendent, or multiple ratings involving other
professional personnel, as the criterion. The blank
represents the combined efforts of many persons. There is
no attempt in this blank to be objective. The blank merely
attempts to remind the rater of some of the more important
aspects of teaching, and teacher characteristics, that he
should keep in mind in making his evaluation, with space
for supporting evidence. The major areas of concern are
expressed as questions as follows:


I. The Teacher as a Director of Learning

    1. Is the teacher well prepared?
    2. Has the teacher the personal prerequisites to do
       effective work?
    3. Are the goals and objectives set by the teacher to
       pupils adequate?
    4. Are the observed teacher and pupil activities well
       chosen?
    5. Does the teacher show skill in directing activities?
    6. Are the aids to learning as observed adequately and
       effectively used?
    7. Are the methods of checking results satisfactory?
    8. Is there evidence of desirable pupil growth and
       development?

II. The Teacher as a Friend and Counselor of Pupils

    1. Is the teacher interested in the pupils as persons?
    2. Does the teacher give time to the pupils as
       individuals?
    3. Does the teacher contribute to wholesome pupil growth
       and adjustment?

III. The Teacher as a Member of a Group of Professional
     Workers

    1. Does the teacher willingly and effectively
       participate in the activities essential to a good
       all-around school program?
    2. Does the teacher willingly and effectively
       participate in the activities essential to his own
       professional growth and improvement?
    3. Does the teacher willingly and effectively
       participate in the activities essential to
       professional solidarity and welfare?

IV. The Teacher as a Member of the Community

    1. Does the teacher show concern over community needs,
       plans, and activities?
    2. Does the teacher show evidence of understanding
       social needs and processes and the schools' relation
       to these?
    3. Does the teacher foster democratic attitudes and
       relationships?
    4. Is there evidence that the teacher has made a
       constructive contribution to community welfare and
       development?


Each category has a number of sub-questions and a five
point rating scale: outstanding, above average, average,
below average, and poor. There is a place for an overall
general merit rating on a five point scale and a place to
indicate strengths and weaknesses. The scale has had
extensive use.

Another approach to criterion development is 
Manwiller’s (52) study of the expectancies of school 
board members regarding teachers. The subjects 
were 391 high school teachers and 134 members 
of boards of education. Among his conclusions 
are these: 


1. The areas of religious life seem to constitute the major
   area of intergroup agreement; the areas of economic and
   civic life present agreement to a lesser degree; and the
   areas of personal-family and social-recreational life
   the least.

2. Some differences of opinion were found in every school
   district between teachers and board members on the
   behavior they thought the community expected of
   teachers.

3. Morale, satisfaction, instructional efficiency, and the
   instruction of teachers depend upon more attention to
   teacher-community relationships.


Studies such as this add new dimensions to 
criterion development. 

Erickson (24), Hellfritzsch (34), Lamke (46), and Schmid
(70) attempted to clarify the situation with reference to
criterion development through the use of factor analysis
techniques. Hellfritzsch, using data collected in
cooperation with Rostker (66), Matthews (54), Rolfe (65),
and Gotham (30), found common factors as follows: a)
general knowledge and mental ability; b) a teacher rating
scale factor; c) personal, emotional and social adjustment;
d) eulogizing attitude toward the teaching profession; e)
possibly some aspect of intelligence; and f) a tendency
toward research as contrasted with administration. In a
second factor analysis he found the following common
factors: a) general knowledge and mental ability; b) a
teacher rating scale factor; c) personal-emotional
adjustment; d) a eulogizing attitude toward the teaching
profession; e) a teaching ability factor; and f) possibly a
residual error. This analysis adds several bits of
information to our knowledge. In the first place the
findings for the two analyses provide parallel data from
somewhat different measures and a different population. The
rating scale factor is again isolated and seems to be
something apart from the other criteria. Residual pupil
gain seems related to the teacher's general knowledge and
mental ability in one analysis, and appears as an
independent factor in the other. The personal and emotional
adjustment factor was present in both analyses. Studies
such as these give a better insight into the structure of
criteria.

Erickson (24) studied forty-two factors: a) nine estimates
of teaching success made on the basis of ratings by the
principal, a state supervisor, peers, pupils, and a self
rating; b) seven measures from the Thurstone Temperament
Schedule; c) sixteen measures of personality based upon the
Cattell sixteen personality factors; and d) ten measures of
preservice achievement. Each of the four groups of measures
was factored separately. The author concluded that if the
factor scores (estimates of three factors) are considered
as estimates of teaching success, there are eight "good"
teachers and seven "poor" teachers as determined by
composite one; ten "good" and eleven "poor" teachers as
identified by composite two; and nine "good" and eleven
"poor" teachers as identified by composite three. He
further concludes that the generally low intercorrelations
among the several temperament, personality, and achievement
measures throw doubt upon their use as measures of teacher
efficiency. He finally concludes that there was no general
factor but three separate measures, each of which might be
used separately in a program of differential prediction.
All of this seems to underline the difficulties involved in
criterion development.

Lamke (46) concluded that the responses of "good" and
"poor" teachers did not fall into two well-defined and
characteristic patterns as measured by the Cattell Sixteen
Personality Factor Test, but that several patterns exist
for good teachers and probably several for poor teachers.
Schmid (70), using the Washburne Social Adjustment
Inventory, the Mooney Problem Check List, and certain
scales of the Minnesota Multiphasic Personality Inventory,
found different patterns for male and female subjects. All
of these factor analyses serve to underline the complexity
of criterion development.

Besides the factor analyses, long time follow-up studies
were made by Brandt (13) and Briggs (14). Brandt conducted
a follow-up study of four prior studies made three to six
years previously, to ascertain whether their findings might
be reaffirmed in such a follow-up study. The studies
investigated were those of LaDuke (45), Rostker (66), Jones
(42), and Lins (49). The criterion under particular
scrutiny was the M-Blank ratings. A follow-up was made for
the four studies separately. While the correlations were
not high, in each instance the predictions improved with
time. The author believes that whatever the M-Blank
measures, it represents something fairly well established
in our professional mores.

Briggs (14) made a ten year follow-up study of the 1943
group studied by Lins and others (49). Using the
principal's ratings as the criterion he found, among other
things, that none of the measures used in the earlier
studies correlated with the principal's rating sufficiently
to have predictive value. On the surface his findings seem
to be in conflict with those of Brandt (13).


Concluding Remarks 








Each investigator has defined his problem and criterion of
teacher effectiveness according to his interest and
perception of the problem. Most of the studies employed a
global sort of criterion; some employed a more restricted
criterion. Yet other investigators may have other interests
and perceptions of the problem. These interests may be of
several types: a) the interest of the investigator may be
in the evaluation and prediction of teacher effectiveness
in particular situations, as, for example, in the case of a
superintendent of schools; b) the interest of the
investigator may be in general teaching ability in a
selected area of subject matter specialization, i.e., the
ability to perform effectively in a variety of situations
with different sorts of pupils with different needs; and c)
the interest of the investigator may be in the measurement
and prediction of teacher effectiveness in attaining some
limited goal or goals of learning and in teaching in
specified situations with particular types of pupils. There
are many combinations of these things. The investigator
should be clear as to his concern and the criterion
appropriate to his problem.




Whatever the investigator's concern, there are various
criteria that one may employ. Those most frequently
employed in this series of investigations are: a) ratings
of teacher efficiency based upon the observation of teacher
behavior made by a single individual or by several
individuals and by different sorts of persons:
superintendents, principals, supervisors, college
professors, students, and peers; b) scores on tests of
qualities, abilities and competencies thought to be
associated with teacher efficiency; and c) products,
usually residual pupil gain after non-teacher effects have
been randomized or taken into consideration by regression
techniques. Each has its strengths and weaknesses.


From a slightly different point of view it can be said that
there are different ways of describing teacher efficiency
and thus, in this sense, different approaches to criterion
development: a) one might study the qualities of the
persons thought to be essential to teacher efficiency; b)
one might study teacher and pupil behavior; c) one might
study the psychological prerequisites to teacher
efficiency: knowledges, skills and attitudes; and d) one
might study the product, pupil growth and achievement, and
other effects. Each of these has its particular strengths
and weaknesses.


In criterion development one employs various sorts of data gathering
devices, and the choice and validation of data gathering devices can be
made with varying degrees of sophistication. The essential steps for
the validation of data gathering devices are pretty well set forth in
the literature on measurement and test construction. The investigations
here summarized showed varying degrees of sophistication in this
respect.







The criteria of these investigations are frequently composite criteria.
The building of composites presents many problems. First of all there
is the matter of whether or not to composite a given set of scores.
Aside from the fact that a composite may not serve one’s purpose, there
is the problem of non-additive scores that will appear in the
consideration of certain data. The non-additiveness arises first out of
the absence of zero points on our measuring scales and, further, out of
theories of the structuring of human abilities. Then, if one does add,
there will be the problem of the weighting of components and the
calculation of the resulting error of measurement.
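
One common way of meeting the scale and weighting problems just described is to put each component into standard-score form before adding. The sketch below assumes that approach; the text does not prescribe it, and the component names, scores, and weights are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standard scores: deviation from the mean in sigma units."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def composite(components, weights):
    """Weighted mean of standardized components, one value per teacher.

    components: dict mapping a measure name to a list of raw scores
    weights:    dict mapping the same names to relative weights
    """
    standardized = {name: z_scores(vals) for name, vals in components.items()}
    total_weight = sum(weights[name] for name in components)
    n = len(next(iter(components.values())))
    return [
        sum(weights[name] * standardized[name][i] for name in components)
        / total_weight
        for i in range(n)
    ]


# Hypothetical scores for four teachers on three differently scaled measures.
components = {
    "rating": [3, 4, 5, 2],
    "test": [55, 60, 70, 45],
    "pupil_gain": [0.1, 0.3, 0.5, -0.2],
}
weights = {"rating": 1, "test": 1, "pupil_gain": 2}  # assumed weights
scores = composite(components, weights)
```

Standardizing sidesteps the absence of a common zero point only in the sense that every component is expressed relative to its own group mean. The choice of weights remains a judgment, and the sketch does not address the error of measurement of the resulting composite.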


The statistical treatment of data can vary greatly. In this series of
studies a very great variety of statistical techniques were employed,
ranging from the very simplest arithmetical calculation to the most
complicated statistical treatment. In general there is an absence of
sampling techniques. This was thought not inappropriate, since the
studies were in the main exploratory and were studies of total
populations of students, for the most part graduated from the
University of Wisconsin.


In examining the various criteria of effective- 
ness employed in these investigations, 
one observes the usual discrepancies between 
theory and practice. Some give very close at- 
tention to one aspect of criterion development 
and forget another, or consider it less impor- 
tant, or just run out of time, or money, or 
energy. 


Obviously, prediction studies are no better 
than the criteria employed or the data gather- 
ing devices with which the data are collected. 
Carefully defined populations, carefully cho- 
sen data gathering devices, well thought out 
research designs, sophisticated statistical 
treatment of data, and well-developed criteria
are essential to good research in this, as in 
other fields of research. 





JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961) 


CHAPTER II


THE METHODOLOGY OF THE INVESTIGATIONS HERE SUMMARIZED 


A. S. BARR 


The investigations here summarized employed a very great variety of
data-gathering devices, research designs, and statistical procedures.
In general, these investigations are exploratory, descriptive
investigations conducted with reference to a rather definite set of
purposes. A very large number of these studies were follow-up studies
undertaken as a part of the teacher education program at the University
of Wisconsin. Over a period of many years the University of Wisconsin
has been interested in the development of better programs for the
education of teachers. One way to improve such programs is to conduct
careful studies of the product. Many institutions have conducted such
follow-up studies. Some of the studies here summarized were quite
elaborate, while some were fairly simple, depending mostly upon
questionnaire data. Follow-up studies are frequently criticized for
their superficiality, but they need not be superficial. They can be,
and need to be on occasion, as complicated as the science of education
itself. As follow-up studies, the studies here summarized dealt almost
wholly with total populations.

Another fairly constant purpose running 
through these studies arises out of a need that 
those preparing and administering admission pro- 
grams, teacher education programs, placement 
programs, follow-up programs, and in-service 
teacher education programs have felt for more 
reliable information about the prerequisites to 
teacher efficiency. None of these activities can 
be satisfactorily performed without more precise 
information about these prerequisites and how to 
identify them. Another fairly constant purpose 
running through these investigations arose out of 
the fact that many persons make predictions about 
teacher effectiveness (that may come about at 
some future date), depending upon various sets 
of prerequisites that are assumed to exist, but 
need to be verified. Admission officers, for ex- 
ample, deal with a particular set of prerequi- 
sites, built partly upon science and partly upon 
tradition that need to be checked. Admission 
officials make predictions which say in effect that 
those not meeting these standards are poor risks 
and they should not be admitted to the program of 
teacher education. Teacher educators say, at 
least in the case of required courses and curricula, 
that these courses and curricula are a necessity 





and students doing well in these will do better as 
teachers than those not doing so well or not taking
such courses. These are predictions and
these predictions need to be checked. Placement
officials and employing officials must not only 
know what a good teacher is but must know how 
to identify one and predict her future efficiency. 
In-service teacher educators have similar needs. 


The Methodology of These Investigations 





In keeping with the general long-time goals of this series of
investigations, these studies are mainly descriptive. Within this frame
of reference a variety of the more conventional classical types of
research designs were employed. These seemed to serve the exploratory
character of these investigations adequately. The sampling types of
research have been reserved for future research. This course was
followed not because of a lack of appreciation of these newer
multivariate sampling techniques (as a matter of fact, the writer has
taught such a course for many years), but from a deep conviction that
the intelligent use of these depends not merely upon their mathematical
foundations, but needs to be preceded by experience with the thought
processes of research, a better appreciation of what the problems are
that need to be solved, theoretical orientation in the psychology of
learning and teaching, and first-hand knowledge of teachers and
teaching.

Many of these studies were status studies, 
not in the sense of something superficially studied,
but in the sense of studies that sought to find out 
what exists. Most studies in psychology and ed- 
ucation are in a sense status studies. Sampling 
studies are for the most part status studies, i.e., 
insofar as they use information about samples to 
draw inferences about the status of something in 
a defined population from which the sample is 
drawn. 

A large amount of time was spent on these
studies in attempting to learn more about the con- 
stituents and concomitants of teacher effective- 
ness. Among the status studies are some eight 
or ten questionnaire surveys. Doane (18), Mitchell 
(56), Goldgruber (28), and Schwahn (71), for ex- 
ample, conducted studies of the course require- 
ments and needs of teachers in undergraduate 
teacher education programs. Kline (43) studied 





the annoyances and satisfactions attending teach- 
ing; Manwiller (52) studied the expectancies of 
school board members with reference to the personal
lives of teachers; Martindale (53) and Knox
(44) studied the situational factors that appeared 
to be related to job satisfaction; Eustice (25) 
studied the experience background of teachers and 
its relation to teacher effectiveness; Golden (29) 
studied the prerequisites to teacher effectiveness 
through the use of the critical incident technique;
and Mann (51) conducted a follow-up study
of University of Wisconsin graduates to ascertain 
their after-graduation successes and failures. 

These studies have been catalogued as questionnaire survey studies, but
others may prefer other designations. For many persons an instrument to
be called a questionnaire must be superficially and hurriedly prepared
by an educational novice; but it need not be so. The data-gathering
devices employed in these investigations, regardless of whether they
lived up to expectation, were carefully prepared. The data-gathering
device used by Goldgruber (28) and Schwahn (71), for example, was
prepared by a statewide committee, after a careful survey of the
literature and one full year of committee effort. The chief
data-gathering device employed by Eustice (25) is designated a Data
Booklet. This booklet, some twenty pages in length, was the subject of
study by a seminar group throughout the greater part of an academic
year. It contains questions about many of the things that people have
asked about teachers in past investigations. Most of the instruments
employed in these investigations were of a highly structured sort, put
into written form, with careful attention to coverage, to the wording
of questions, and to the likelihood of getting accurate information.
Some would not call these and other similar data-gathering devices
questionnaires, but the studies were status studies, nonetheless.

There were other normative survey types of investigations that
proceeded somewhat differently. Barr and Emans (7), for example,
analyzed 209 teacher rating scales to ascertain what those constructing
such scales would have those who use them look for in teacher
evaluation; Fleming (27) examined certain issues of wide-circulation
newspapers to ascertain the concerns that the public might have about
teachers. Thiede (76) studied the undergraduate achievement records of
students enrolled in the several schools and colleges at the University
of Wisconsin. Blum (10) and Eichstaedt (22) compared the mental,
social, and emotional characteristics of students enrolled in other
schools and colleges of the University; Guiles (32) studied the
practices, conditions, and trends in the Wisconsin State Teachers
Colleges in relation to functions; Lien (48) studied the students
enrolled in the four curricula in the State Teachers College,
Whitewater; Nemec (58) studied the relation between teacher
certification and teacher education in the state of Wisconsin; Russell
(68) studied the intellectual and affective characteristics of
experienced teachers pursuing a master’s degree program in teacher
education; Stevens (74) and Johnson (38) studied the attitude of
high-school seniors toward teaching as a career; Vertein (80) studied
the personal, social, and intellectual characteristics of
undergraduates enrolled in courses for teachers at the Wisconsin State
College, Platteville.

There were a limited number of experimental studies. Draves (21)
studied class size and instructional methods in relationship to certain
student characteristics; Emans (23) and Von Eschen (81) studied
programs for the in-service education of teachers; Lange (47) studied
concept development by students enrolled in certain undergraduate
courses in psychology and practice teaching at the University of
Wisconsin; and Pitts (59) studied the effectiveness of different
methods of organizing and directing student experiences in practice
teaching.

Many of the studies pertained to the validity and reliability of
various instruments that purported to measure teacher effectiveness.
Barr (8), Torgerson (78), Lyons (50), Johnson (39), and Walvoord (83),
working as a team, studied the validity and reliability of a large
group of measures; Rieck (62) and Flanagan (26) studied the
applicability of the Minnesota Multiphasic Personality Inventory to
undergraduates enrolled at the University of Wisconsin; Jackson (36)
constructed and validated a social proficiency scale; Mathews (54) made
an item analysis of certain measures that purported to measure teacher
effectiveness; Reitz (61) studied the intelligence of students enrolled
in teacher education programs; and Rudisill (67) developed a scale for
measuring the teacher’s personality.

A number of the studies were concerned with the classroom behavior of
teachers. Barr (3) and Jayne (37) made detailed studies of the verbal
behavior of teachers. Tiedeman (77) studied teacher-pupil relations;
Brookover (12) studied the relation of certain social factors to
teacher effectiveness. Bollinger (11) studied the social impact of
teachers upon pupil character and personality; Singer (73) made a
rather complete survey of the sociometrics of the classroom. Many of
the studies linked the teachers’ classroom behavior to other measures
of teacher effectiveness.

A number of the studies contrasted the behavior and other
characteristics of good and poor teachers. Barr (3) and Carlson (16)
employed the logical principle of double agreement in their research
design in searching for the characteristics of good teachers. Lamke
(46) and Jones (40) used this design as a part of a more inclusive
research design involving the use of the discriminative function.







Several of the investigators used case studies 
incidental to other designs, but no investigator 
employed this method as the principal research
design. Brandt (13) and Briggs (14) conducted 
long-time, in-service, follow-up studies. Many 
of the individuals were studied from three to five 
years, and some for longer periods of time. 

A considerable number of these investigations were of the multiple
regression prediction type. The studies by Rostker (66), Rolfe (65),
La Duke (45), Gotham (30), and Hellfritzsch (34) were of the
pre-prediction type, i.e., the test scores in these studies were all
obtained from in-service teachers. These studies were the forerunners
of the Lins (49) and Von Haden (82) studies, in which the testing
program started with the group when they were freshmen and continued
throughout the four-year undergraduate program and into their first
years of teaching. The Anderson (1), Erickson (24), Lamke (46),
Schwartz (72), and Montross (57) studies were all of the multiple
regression prediction type. There are other studies that make lesser
use of regression techniques.

The factor analysis technique was used in a number of the
investigations. Hellfritzsch (34) compared the factor pattern of two
sets of data; Schmid (70) compared the factor patterns of men and women
teachers; Ringness (64) employed factor analysis in a study of the
reasons given by students for a choice of teaching as a major interest;
Hampton (33) employed factor analysis techniques in analyzing the
ratings of Iowa teachers; Erickson (24) used factor analysis techniques
in the study of three criteria of teacher effectiveness. Lamke (46)
used factor analysis techniques in the study of the personality
characteristics of teachers, and Jones (40) used factor analysis as an
adjunct to her use of the discriminative functions.

Accordingly, then, a variety of research de- 
signs were employed by the investigations here 
summarized, mostly of the descriptive nonsam- 
pling type. In general, these investigations are 
of the type described by the early volumes on research
techniques, such as the Good, Barr, and Scates
Methodology of Educational Research. The designs
seemed quite adequate to the purposes of
these investigations. As exploratory investiga- 
tions they probed almost all possibilities in light 
of the prevailing concepts of the times in which 
they were made. Each investigation was preced- 
ed by an extensive survey of the literature; each 
doctoral candidate was guided by a committee of 
three faculty members; each study represents a 
careful attempt at critical thinking. As such, 
even as a source of expert opinion, these studies 
constitute an important source of hypotheses, 
facts, and materials for further research in this 
field. 

On the short end, there are many inadequacies 
in these studies that need not be repeated: 








1. Teaching is nowhere adequately defined. 
There are materials in these studies for such a 
definition but no definition is made. 

2. The aggregate of vocabulary employed in 
these investigations is overwhelmingly extensive. 
Some of this vocabulary is defined, but most is 
used with the assumption that the current under- 
standing of these terms is adequate. 

3. These investigations are, in the main, of 
the uncontrolled variety. Such natural events in- 
vestigations are needed but in the absence of ar- 
tificial controls, there must be a careful descrip- 
tion of events or measurement of important factors 
and influences. There was extensive measure- 
ment of factors but inadequate description of 
events. 

4. Many of these studies were correlation 
studies. Nowhere does there seem to be recog- 
nition of the fact that coefficients of correlations 
cannot be taken at face value. 

5. While many of the criteria were most 
carefully defined and developed, many of the 
studies employed efficiency ratings, which do not
constitute an adequate criterion of teacher effec- 
tiveness even though it must be accepted as ade- 
quate if the investigators’ interest is in the su- 
perintendents’ evaluation of teachers. 

6. These investigations, generally speaking, 
accept the current elementalist factor concept of 
the constituents and concomitants of teacher effectiveness.
They may be quite correct but this
possibly needs further investigation. 


The Teachers Studied 





The choice of subjects for the series of studies 
herein summarized was closely tied to the purposes
of these investigations. Many of the studies
were, as has already been said, follow-up studies 
of graduates of the University of Wisconsin un- 
dertaken to gain more information about the prod- 
uct, their successes and failures, and the factors 
contributing to their effectiveness. These Uni- 
versity of Wisconsin graduates were a highly se- 
lected group. According to admission policies, 
these prospective teachers were first of all from 
the upper half of the high-school class from which 
they were graduated and from those who had 
scores above the median on the Henmon-Nelson 
Psychological Examination. They were also rec- 
ommended by the principal of the school from 
which they were graduated. Further restrictions 
were placed upon the group at the time of their 
admission to the School of Education at the beginning
of the junior year. Under a grading system in which
an A received a weight of 3, a B a weight of 2, a C a
weight of 1, and a D a weight of 0, each student was
required to have a minimum grade point average of 1.5,
which was above the University
average for all students. Because of admission





restrictions the average grade point for School of 
Education Juniors and Seniors was somewhat 
above that of other schools and colleges of the Un- 
iversity. The major characteristics of this group 
as shown by Lins are summarized in Table I. 
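
The grade point scheme described above (A = 3, B = 2, C = 1, D = 0, with a minimum average of 1.5 for admission) can be worked through in a few lines. The text does not say whether courses were weighted by credit hours; the sketch below assumes credit weighting, and the function names and transcripts are hypothetical.

```python
# Grade weights as stated in the text; no other grades are mentioned there.
GRADE_WEIGHTS = {"A": 3, "B": 2, "C": 1, "D": 0}
MINIMUM_AVERAGE = 1.5  # admission minimum stated in the text


def grade_point_average(courses):
    """Credit-weighted average (credit weighting is an assumption).

    courses: list of (letter_grade, credits).
    """
    total_credits = sum(credits for _, credits in courses)
    points = sum(GRADE_WEIGHTS[grade] * credits for grade, credits in courses)
    return points / total_credits


def meets_minimum(courses, minimum=MINIMUM_AVERAGE):
    """True when the average reaches the admission minimum."""
    return grade_point_average(courses) >= minimum


# Two hypothetical transcripts of four three-credit courses each.
admitted = [("A", 3), ("B", 3), ("C", 3), ("B", 3)]   # average 2.0
rejected = [("C", 3), ("C", 3), ("B", 3), ("D", 3)]   # average 1.0
```

On this scale a straight-C record averages 1.0 and a straight-B record 2.0, so the 1.5 minimum sits midway between a C and a B average.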

Keeping in mind that there are two sorts of 
scores in Table I, namely T-scores and grade 
point averages, it can be seen that the means for 
the group were about one-half sigma above the 
mean for comparable groups. The mean of the 
T-scores and for the grade point were around
1.25. Many studies of the intelligence of profes- 
sional groups reported in the literature, for ex- 
ample, would place teachers below the college 
mean in intelligence. Not only is this group
somewhat above the average of the University in
these respects, but the range has been narrowed
to about three sigmas, i.e., one and one-half sigma
below and one and one-half sigma above the
mean. Thus neither extreme is well represented. 
Not all studies were follow-up studies of University 
of Wisconsin graduates, but this group appears 
to be fairly representative of other university 
groups for the period of time encompassed by 
these studies. Comparative data provided by sub- 
sequent investigators appear to support the rep- 
resentativeness of these data. 
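
For readers unfamiliar with T-scores, the arithmetic behind "one-half sigma above the mean" and "about three sigmas" can be sketched as below. It assumes the conventional T-score scale with a mean of 50 and a standard deviation of 10; the text does not state which convention Lins used, so the constants are an assumption.

```python
T_MEAN = 50  # conventional T-score mean (assumed)
T_SD = 10    # conventional T-score standard deviation (assumed)


def t_score(z):
    """Convert a standard (sigma) score to a T-score."""
    return T_MEAN + T_SD * z


def z_from_t(t):
    """Convert a T-score back to sigma units."""
    return (t - T_MEAN) / T_SD


# A group mean half a sigma above the norm-group mean.
group_mean_t = t_score(0.5)                    # 55.0

# A total range of three sigmas expressed in T-score points.
range_width_t = t_score(1.5) - t_score(-1.5)   # 30.0
```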

In a further study of the characteristics of students enrolled in the
School of Education at the University of Wisconsin, Thiede (76) found
that the scores for women were significantly higher than those for men.
(There are at the University of Wisconsin more women than men certified
to teach.) When the scores for women in teacher education were compared
with those of women pursuing three other programs, namely the Bachelor
of Arts program, Home Economics, and joint majors between education and
an academic field, twenty-two of the twenty-seven differences studied
were in favor of the teacher education group (Table II). For the men,
when compared with the Bachelor of Arts, School of Commerce,
Engineering, pre-medic, and joint major groups, nineteen of the
thirty-five differences calculated were in favor of the teacher
education group (Table III).

The Wisconsin graduates were a special group in other respects. They
were, for example, all first or second year secondary-school teachers
employed by Wisconsin school superintendents and teaching, for the most
part, in the south central part of the state.

The fact that many of these studies employed 
University of Wisconsin graduates and that these 
are highly selected groups should always be kept 
in mind. Generally speaking, as has already been 
said, the studies to be summarized are correlational
studies, and it is generally recognized that
restrictions on the range of talent tend to lower 
the size of the coefficient of correlation. Many 
of these teachers will also have met the minimum 
cut-off points for many of the measures applied 







to them and for these we would expect the corre- 
lations to be low and to fluctuate around zero 
where the effects of particular variables have been 
largely eliminated through selection. In examin- 
ing the data it will be difficult to ascertain which 
zeros, or near zeros, arise from selective fac- 
tors and which arise from a no relationship situ- 
ation. A convenient illustration of this point will 
be found in the reported correlations between in- 
telligence and teacher effectiveness. Some would 
interpret the low correlations sometimes found in 
these and other studies as indicating no relation- 
ship between intelligence and teacher effective- 
ness (Table IV). There were many near zero 
correlations between intelligence and teacher ef- 
fectiveness found in the studies of University of 
Wisconsin graduates, but these groups were, as 
already said, highly selected in many respects, 
including intelligence. Under these circumstances
many of the correlations between intelligence and 
the criterion scores would be expected to be near 
zero. In absence of convincing data to the con- 
trary, it will probably be best, however, to con- 
clude that teaching requires a reasonably high 
level of intelligence. Possibly a more fruitful 
approach might be made through studies of the 
kinds of intellectual demands placed upon teachers. 
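
The effect of restricted range on the size of a correlation, discussed above, can be illustrated with a small simulation. Everything in it is an assumption made for illustration: the data are artificial, the unrestricted correlation is set at .50, and "selection" is modeled as keeping only cases above the mean on the predictor, loosely echoing the above-the-median admission rule described earlier.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment coefficient of correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1961)   # fixed seed so the run is repeatable
TRUE_R = 0.5                # assumed correlation in the unselected population
N = 20000

# Artificial "ability" scores and a criterion correlated with them.
ability = [rng.gauss(0, 1) for _ in range(N)]
criterion = [
    TRUE_R * a + math.sqrt(1 - TRUE_R ** 2) * rng.gauss(0, 1)
    for a in ability
]

r_full = pearson_r(ability, criterion)

# Keep only cases above the population mean on ability (the selected group).
selected = [(a, c) for a, c in zip(ability, criterion) if a > 0]
r_selected = pearson_r([a for a, _ in selected], [c for _, c in selected])
```

In runs of this kind the coefficient in the selected group should come out clearly lower than in the full group, although the underlying relationship is the same in both. A more severe cut would lower it further, which is why a near-zero coefficient in a highly selected group is weak evidence of no relationship.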

If we may pursue this matter a bit further, 
Stoelting (75), in a study of selective devices employed
at the University of Wisconsin, found little
to be gained in a more vigorous selection pro- 
gram than that now pursued. 

He concluded that if the minimum grade point average were increased
from the 1.3 employed at the time of his study to the 1.5 employed
earlier, the higher minimum would screen out 13 of the 24 who were
judged to be of less than average teaching ability, but at the same
time eliminate 31 who were rated average, 7 rated above average, and 1
rated superior. Nemec (58) found that over a ten-year period only 16 of
the 265 teachers studied who were denied a certificate to teach after a
two-year probationary period were from the University. Knowledge of
subject matter was the least frequent source of difficulty and lack of
pupil control the most frequent cause of trouble.
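
The cost of the stricter cut-off in Stoelting's figures is easier to see when the counts are combined. The short calculation below uses only the numbers quoted above; the variable names are illustrative.

```python
# Counts quoted from Stoelting (75) in the text above.
below_average_total = 24       # teachers judged below average in ability
below_average_screened = 13    # of those, the number a 1.5 minimum removes
others_screened = {"average": 31, "above average": 7, "superior": 1}

# Everyone the higher minimum would have eliminated.
total_screened = below_average_screened + sum(others_screened.values())  # 52

# Share of the eliminated group that was actually below average.
share_below_average = below_average_screened / total_screened            # 0.25

# Share of the below-average teachers that the cut-off would catch.
share_caught = below_average_screened / below_average_total
```

On these figures the higher minimum would remove 52 candidates, of whom only one in four was judged below average, while still missing 11 of the 24 below-average teachers.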

Another group of studies was made with elementary teachers. While these
groups were also selected in a sense, they were selected in a somewhat
different sense. They were from many different institutions, with a
great diversity of training and experience, and heterogeneous in this
sense, but a selected group in that, as experienced teachers,
considerable weeding out had already taken place through the operation
of a great variety of forces acting to eliminate teachers. They were
further chosen, for their particular positions, by superintendents who
did not knowingly employ inferior teachers. Detailed data on the
characteristics of these teachers will be found in the several studies.
One group of studies, for







TABLE I 


MAJOR CHARACTERISTICS OF THE TEACHERS STUDIED 
(As reported by Lins) 





Characteristic 





Henmon-Nelson, Mental Ability College 
Norms (T-Scores) 


Rank, High School Class (T-Scores) 
American Council Psychological (T-Scores) 
Cooperative General Culture (T-Scores) 
Cooperative English Test (T-Scores) 
Cooperative Reading Test (T-Scores) 
Grade Point Average (Four Years) 

Grade Point Average (Junior-Senior Year) 


Practice Teaching (Grade Point) 





TABLE II 


MEAN DIFFERENCES BETWEEN THE TEACHER EDUCATION AND THREE OTHER 
GROUPS FOR WOMEN 





Variable BA Group Home Ec. Joint Majors 





High-School Percentile Rank +3. 76 +3. 09 +o 
ACA Percentile Rank -7. 42 +4, 71 
Earned Grade Point Average . 05 . 26 
Predicted Grade Point Average ; .07 


Earned Grade Point Average 
Nonprofessional Subject j . 32 


Earned Grade Point Average 
English and Speech ‘ me | 


Earned Grade Point Average 
Science ; at 


Earned Grade Point Average 
Foreign Language R Pe 


Earned Grade Point Average 
History and Social Studies 











TABLE III 


MEAN DIFFERENCES BETWEEN THE TEACHER EDUCATION AND FIVE OTHER GROUPS FOR MEN 





Variable Pre-Med. 





High-School Percentile Rank -6. 60 
ACE Percentile -7. 62 


Earned Grade Point Average 
All Subjects : .23 


Predicted Grade Point Average . . SSE 


Earned Grade Point Average 
Nonprofessional Subjects ‘ . 24 


Earned Grade Point Average 
English and Speech . . 09 


Earned Grade Point Average 
Science 








TABLE IV 


CORRELATIONS BETWEEN CERTAIN MEASURES AND FOUR CRITERIA OF 
TEACHER EFFECTIVENESS 





                                  In-service  Departmental  Placement Bureau  Practice Teaching
                                  Ratings     Ratings       Ratings           Grades

Henmon-Nelson Psychological         .314        .216          .119              .094
ACE Psychological                  -.027        .163                            .026
Cooperative Reading Test Scores     .056        .240                            .059
High-School Rank                    .221        .205                            .237
Grade Point Average                 .385        .335                            .378










example, by Johnson (39), Lyon (50), Torgerson 
(78), and Walvoord (83) used as their subjects 
teachers from five Wisconsin cities, namely, An- 
tigo, Lake Mills, Marinette, Marshfield, and 
Stevens Point, cities ranging, at the time these 
studies were made, in population from 2,007 to
13,734. These teachers were doubtless a selected
group in the sense here discussed. There
was also some drawing off of the most competent 
teachers by the larger communities of the state. 

The teachers employed in this particular study had an average of
slightly more than ten years of experience: 35 percent had one to five
years of experience, 31 percent had six to ten years, and 34 percent
had more than ten years of teaching experience. Of the sixty-six
teachers studied, one had a single year of professional training,
fifty-one were two-year normal school graduates, nine had three years
of professional training, and five had four years of professional
training. With respect to the institutions from which they were
graduated, and the amounts of their training and experience, they were
a more diverse group than the University of Wisconsin graduates, but in
other respects they were still a selected group, that is, they were
selected in respects that probably have pertinency to this study.

Studies by Rostker (66), Rolfe (65), LaDuke (45), Von Eschen (81),
Gotham (30), Mathews (54), and Hellfritzsch (34) used overlapping
groups of elementary-school teachers. These teachers were for the most
part seventh- and eighth-grade teachers teaching in one- and two-room
rural schools. They were graduated principally from Wisconsin state
normal schools, most frequently with three years of professional
education. They ranged from 20 years of age to the middle fifties. They
had from zero to thirty years of teaching experience, and the range for
time spent in their present position was from zero to thirty years.
This was probably as heterogeneous a group as any investigated, at
least in some respects, and as such should supply sizeable correlations
if such are to be had. The findings from these studies appeared to
differ from others, however, only in details.

Some of the studies employed out-of-state
subjects. Hampton (33) studied graduates of Iowa
State Teachers College, and Brookover (12) used 
only rural consolidated township high schools in 
twelve north central Indiana counties. Some of 
the interview and questionnaire studies were based 
upon even more diverse populations. 

In summary, then, while the groups studied
were, for the most part, highly selected groups,
taken individually and collectively, there was
much diversity in the groups studied, particularly
in such matters as age, training, experience, in- 
stitutions from which they were graduated, and 
the grades and subjects which they taught. 

One final comment should probably be made upon the subjects studied.
Most of the studies herein summarized are non-sampling studies based
upon available populations. To some people this will seem bad, but the
research designs employed and earlier described seemed to be adequate
to the purposes of the investigations. As has already been said, in
many instances the studies involved total populations, that is, total
for the concern of the investigators (this was particularly true of the
follow-up studies), and, ordinarily, where there is time, money, and
energy to collect essential data with adequate care, total population
studies seem to be preferred. There is much confusion on this point.
Sampling is a very high-powered instrument of research and needs much
more understanding and use than it gets today, particularly when one is
not able to study total populations. But one observes instances where
there was apparently the time, money, and energy to do a careful
population study, but less comprehensive methods were employed
nonetheless, possibly because someone thought this a better approach.
Sampling need not always be employed regardless of need and purpose.
The studies here reported are almost without exception exploratory
studies directed toward very specific problems in specified situations.
For other important purposes beyond those of the investigators, these
studies will be inadequate. Presumably, some of these investigators
will in time define new purposes and see important uses for sampling
techniques in new investigations with more ambitious or inclusive
purposes. Some investigators will be unhappy about the absence of error
formulae, but the errors of error formulae are, generally speaking,
errors of sampling; as far as this author is concerned, he is more
disturbed by the number of instances in which standard errors and tests
of statistical significance were inappropriately employed.

The question is sometimes asked: In the absence of sampling
procedures, how does one generalize to future populations? The
answer is that one makes statements about the future on the
basis of the understanding and information about the present and
the completeness of the explanation that one is able to achieve.
All sampling studies,
as do all population studies, concern themselves 
with the present. If there are not too many 
changes in conditions as one goes from the pres- 
ent to the future, and if one’s information is rea- 
sonably complete and accurate, one may engage 
in prediction with the expectancy that the predic- 
tions may have validity for the future. 

In brief, then, taken as a whole, a very large number of
experienced and inexperienced teachers were involved in these
studies, coming from all socio-economic levels, but mostly from
middle-class families. While both elementary- and
secondary-school teachers were among those studied, the
secondary-school teachers were mainly teachers of the academic
subjects. While there was much diversity among the groups
studied, as groups they were in most instances highly selected
groups. These facts should be kept in mind in the interpretation
of the data.


The Data-Gathering Devices Employed in 
the Investigations Here Summarized 








Much attention was given in these investigations to
instrumentation. A very great variety of data-gathering devices
were employed in the investigations here summarized. As a matter
of fact, one of the purposes of these investigations was to
ascertain what progress has been made in the identification of
important characteristics of teachers. If the instruments
employed in the investigations are published or are otherwise
available, they will be discussed in a succeeding chapter.
Unpublished materials will be found as a part of the appendices
of the original doctoral theses. The instruments varied greatly
as to validity and reliability. The details of these instruments
are discussed in a later chapter.





JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961) 


CHAPTER IV 


DATA-GATHERING DEVICES EMPLOYED IN THE WISCONSIN STUDIES 


CLARENCE BEECHER 


This chapter was undertaken to provide in convenient form a
list of the data-gathering devices employed in the Wisconsin
Studies of Teacher Effectiveness with information about their
validity and reliability. These measuring devices
include tests, rating scales, behavior records, 
check lists, questionnaires, interviews, and 
physical measurements of various sorts. 

A very large amount of effort during the last
half century has gone into the study of ways of 
evaluating teacher effectiveness, and many data- 
gathering devices of various sorts have been de- 
veloped to assist in the processes of evaluation. 

An important and significant consideration 
underlying the study of teaching effectiveness is 
the development of valid, reliable, and objective 
measures of the teacher characteristics thought 
to be associated with effectiveness. The degree to which these
measures are valid and reliable, and how well they serve the
purposes for which they were constructed, will affect the
accuracy with which teacher effectiveness may be evaluated.

The data-gathering devices here reported were used in relation
to various criteria of teacher effectiveness. Seven criteria of
teacher effectiveness were employed in the summary of the data
relative to the various data-gathering devices here reported.

These categories were selected on the basis of their
all-inclusiveness. They comprise those used both with teachers
in-service and in pre-teaching teacher education programs. The
criteria were as follows:

Criteria                                          Abbreviation

I.   In-service Rating                            ISR
     a. By the superintendent                     ISRst
     b. By the principal                          ISRp
     c. By other supervisory officials            ISRsy
     d. By teacher educators                      ISRte
     e. By departmental personnel in areas
        of specialization                         ISRas
     f. By state departmental personnel
     g. Self rating
II.  Peer Rating
III. Pupil Gain Score
IV.  Pupil Rating
V.   Composite of Test Scores from tests
     thought to measure teaching effectiveness
VI.  Practice Teaching Grades
VII. Combination or composites of some or all
     of the above criteria                        C


Only data-gathering devices used in the Wisconsin studies and
for which there are data in relation to the criteria will be
described. Table I indicates the frequency of use of the seven
criteria employed in studying the 104 data-gathering devices.

In the materials to follow, data are presented relative to 104
of the 182 measures employed in the Wisconsin studies, namely
those for which there are reported correlations with the seven
criteria of teaching effectiveness already listed. Seventy-four
of the objective and subjective measures correlated .36 or more
with the indicated criteria. These are listed with an asterisk
(*) beside the enumerated measure in the left margin.

The following abbreviations were used in 
Table II: 


Reliability of measure                            Rel
Reliability reported in manual, sometimes
  in thesis, sometimes in Buros*                  M
Reliability obtained from studies                 S
Reliability not given                             NG
Validity of measure                               Val
Source of information                             SI
Criteria                                          Cri
Number of cases that measure was
  validated against                               N
Elementary division                               El
Secondary division                                Sec


*Oscar Krisen Buros, Mental Measurement Yearbook (New Brunswick, N. J.: Rutgers University Press). 











TABLE I


FREQUENCY (f) OF USE OF THE DIFFERENT TYPES OF DATA-GATHERING DEVICES
IN RELATION TO THE DIFFERENT CRITERIA


[Table body not legible in the source.]





TABLE II


SELECTED LIST OF DATA-GATHERING DEVICES EMPLOYED IN THE WISCONSIN STUDIES 
OF TEACHING EFFECTIVENESS 








Abilities to Organize Research Ma- 
terials, Revised Form, J. W. 
Wrightstone, Bureau of Publications 
Teachers College, Columbia Uni- 
versity , . 31 


Adjustment Inventory, Student Form 
1934, Hugh M. Bell, Stanford Uni- 
versity Press . 93 (S) [ 26. 66 
Average 4 23.0 
Score 34. 33 
. 040 
. 272 
. 141 


. 234 
Part B . 186 


Agricultural Manipulative Skills In- 
formation, Department of Education 
University of Wisconsin 


Almy-Sorenson Rating Scale for 
Teachers, Public School Publishing 
Co. , Bloomington, Illinois 


American Council Civics and Gov- 
ernment Test, Form B, World Book 
Co., Yonkers-on-Hudson, N. Y. . 82 (M) 1.50 to 9.25 
Mean 8.52 
Gain 4.5* 
-. 01 
. 29 to . 40 


American History Test, Form A, 
1939-40 Edition, State High School
Tests for Indiana, S. Provus and V. 
F. Dawald and State High School 
Test Committee j . 229 to .185 


Aptitude Test for Elementary and 
High School Teachers, J. E. Bath- 
hurst, F. B. Knight, G. M. Ruch, 
and F. Telford, Bureau of Public 
Personnel Administration, Washing- 


ton, D. C. - . 046 to . 786 





TABLE II (Continued)





Rel, 





(No. 7 continued) 


Autobiography based on eight per- 
sonal qualities 


Audio Recordings 


Barr-Harris Teacher Performance 
Record, A. S. Barr and A. E. Harris,
Dembar Publications, Inc., 1943 


Battery of Alternation Tests on 
Flexibility, R. W. Kleemeier and 
F. J. Dudek, "A Factorial Investigation
of Flexibility," Educational
and Psychological Measurement, X 
(Spring, 1950) 





a) Two digit numbers, addition 


b) Two digit numbers, subtraction 


c) Two digit numbers, mixed flex- 
ibility 


Battery of Objective Type Measures: 
tempo, fluency, speed, suggestibility,
disposition-rigidity, dexterity-coordination


California Test of Mental Maturity, 
Intermediate Series, Grades 7-10, 
E. T. Sullivan, W. W. Clark, and 
E. W. Tiegs, California Test Bur- 
eau, Los Angeles 28 


California Test of Personality, In- 

termediate Series, Grades 7-10, 

Form A, W. W. Clark, E. W. 

Tiegs, and L. P. Thorpe, Califor- 

nia Test Bureau, Los Angeles 28 . 06, . 20 
.10 to .21 
. 05 
oly » Be 







TABLE II (Continued) 





Rel. Val. 





(No. 14 continued) . 00 
. 29 


Confidential X Rating Form, State 


Department of Public Instruction, 
Madison, Wisconsin 


Cooperative General Culture Test, 

American Council on Education, 

College 1930-1951, Form XX, N. J. 

Blair et al, Cooperative Test Divi- 

sion, Educational Testing Service . 034 
. 228 
. 194 
. 583 
. 549 to -.042 
.161to .105 
.133to .054 
. 194to -.038 
.206to .003 




Cooperative Social Studies Test, 
American Council on Education, 
Grades 7, 8, 9, Form R, Agatha 
Townsend and Mary Willis, Cooper- 
ative Testing Service, 15 Amster- 
dam Ave. , New York 


Creative Effort Tests of Disposition 
Rigidity, Raymond B. Cattell, A 
Guide to Mental Testing, University 
of London Press, 1953 


a) Reverse strokes 
b) The Alphabet Tests 


Estimate of Teacher’s Qualification, 
Form 6A Blank, Placement Bureau 
Iowa State Teachers College, Cedar
Falls, Iowa 


General Category Rating Form, Iowa 
State Teachers College, Cedar Falls -T1to.91 


(S) 


Giles Recitation Score Card, 

World Book Co., Yonkers-on- Hudson 

New York, 1925 . 89 (S) -. 444 to . 812 
-. 322 to . 289 





TABLE II (Continued)





Rel. Val. 





Grade point average, Freshman- 
Sophomore 


Grade point average, Junior-Senior 


Grade point average, major teach- 
ing field 


Grade point average, overall pro- 
fessional courses 


Grade point average, professional 
course 


Grade point average, professional 
less practice teaching grades 


Grade point average, university 


Grades, practice teaching 


Grades, practice teaching, Educa- 
tion Methods Course 


31. Grades, practice teaching, Educa- 
tion 75 





TABLE II (Continued)





Rel. 





Guide Sheet C, Teacher Personnel 
Research Committee, University of 
Wisconsin (mimeographed) 


Guilford-Zimmerman Temperament
Survey, J. P. Guilford and W. S.
Zimmerman, Sheridan Supply Co.,
Beverly Hills, California


Henmon-Nelson Tests of Mental 
Ability, Forms A, B, C, V. A. C.
Henmon and M. J. Nelson, Houghton
Mifflin Co., New York


Interviews 
a) Composite interviews 


b) Eight qualities and the composite 
for interviews 


Interview Digest on Eight Personal 
Qualities 


Kuhlman-Anderson Intelligence 
Test, F. Kuhlman and R. Anderson, 
Fourth Edition, Grades VII-VIII, 
Personnel Press, Inc., (formerly 
published by Educational Testing 


Bureau, Educational Publishers, 
1953). 


Link Inventory Activities and Inter- 
ests, 1938 Revision, Grades 7-13, 
Henry C. Link et al, Psychological 
Corporation, New York 


-90 to .97 (M) 
.87 to .94 (M) 


-91 to .95 (M) 


.73 to .88 (M) 







TABLE II (Continued) 








Michigan Education Association 

Teacher Rating Card, Michigan Ed- 

ucation Association, Lansing, Mich- 

igan ; . 273 
. 770 
. 931 to . 857 
.116 to .161 
. 346 to .114 
. 297, .097 
. 39 
12 to .31 


Minnesota Multiphasic Personality
Inventory, Revised Edition, S. R. 
Hathaway and J. C. McKinley, Psychological
Corporation, University
of Minnesota, 1943 .71to.83(M) -.567 to . 776 


Morris Trait Index L, E. H. Morris, 
Public School Publishing Co., 
Bloomington, Illinois . 824 (S) . 56 
Mean | .18 to 4. 68 
Gain” -.169to . 441 
-. 234 to . 069 
. 357 to .198 
0. 9% 
-.17 
. 03 to . 30 


National Teachers Examination, 
Professional Information Section, 
1951 Edition, Educational Testing 
Service, Princeton, N. J. . . 106 
. 240 
. 248 
. 202 
.199 to . 290 


New Standard Arithmetic Test, Form 
V, T. L. Kelley, G. M. Ruch and
L. M. Terman, World Book Co., 
Chicago, Illinois .71to.95 (M) -. 337 to . 326 


Objective Tests of Primary Source 
Traits, Test 38a, Department of Ed- 
ucation, University of Wisconsin, 
mimeographed . 32 to . 28 


- 60 to . 31 


Observation Scale for Rating Citi- 

zenship at School, Vernon Jones, 

Bureau of Educational Research, Un- 

iversity of Wisconsin 111. 28 | Ave. 
139. 38| Rating 


Original Activity Analysis List of 184 
items describing observable teacher 
activities, C. D. Jayne, University 
of Wisconsin -.98 to . 869 


-.41 to . 81 





TABLE II (Continued)





Rel. Val. 





(No. 46 Continued) -. 75 to -.19 


-.66to .86 
-.3lto .42 
P =-.08 to .20 


Orientation Test Concerning Fundamental
Aims of Education, A. S.
Lewerenz and H. C. Steinmetz, Cal- 
ifornia Test Bureau, Los Angeles, 
California . 89 (M) 6. 68 

Mean j{ 1.12 to 15.31 

Gain ) 

-. 06 
. 23 to . 35 


Paired Comparison Rating Scale, 
N. D. Hampton, Iowa State Teach- 
ers College, Cedar Falls .71to.87(S) -.33 to .51 


.00 to .56 


Paired Comparison Rating Scale, (S) 

Reasons for Choice of Teaching as -64to .93 

a Profession, Department of Educa- -40 to .88 . 447 to . 416 
tion, University of Wisconsin . 386 to . 362 


Participator Rating Scale, Depart- 
ment of Education, University of 
Wisconsin (mimeographed) . 524 to . 741 


Peer Evaluation Form, adapted 
from Student Reaction to Instruction 
Questionnaire, Roy C. Bryan,
Western Michigan College of Educa- 
tion, Kalamazoo, Michigan 


Pennsylvania Teacher Rating Score 
Card, Commonwealth of Pennsyl- 
vania, Department of Public In- 
struction, Harrisburg, Pennsyl- 


vania . 403 to . 882 
. 060 to . 231 


Percentile Rank, High School 




Personal Health Standard Scale, 
T. D. Wood, Bureau of Publications, 
Teachers College, Columbia Uni- 


versity . 508 to . 135 


.176 to . 453 
. 324 to . 461 







TABLE II (Continued) 





Rel. Val. 





Personality as measured by an in- 
formal ten-point scale for rating 
personality. Drawn from Common- 
wealth Teacher Training Study and 
other sources 


Personality Inventory, R. G. Bern- 
reuter, Stanford University Press 
California 

(Bn) Neurotic Tendency 


(Bs) Self-sufficiency 


(Bd) Dominance-submission 


(Fc) Flanagan self-confidence 
(Fs) Flanagan sociability 


Personality Rating Scale, A. S. Barr 
et al., Department of Education, Un- 
iversity of Wisconsin (mimeo- 
graphed) 


Pressey Diagnostic Reading Test, 
Form A, Grades 3-9 


Professional Information as meas- 
ured by an unpublished test construct- 
ed by T. L. Torgerson, University 
of Wisconsin, 1930 


Psychological Examination for Col- 
lege Freshmen, American Council 
on Education, L. L. Thurstone and 
T. G. Thurstone, Editions 1929, 
1936, 1945, 1948, Cooperative Test 
Division, Educational Testing Ser- 
vice 


-. 006 to . 589 
-. 256 to . 602 
-. 482 to . 674 


.85 to .92 11. 6% 
(M) 65. 74 
Mean / 4. 37to 29.50 
Gain -. 


. 43 to 12. 50 
| 

.01 to .19 

a 
-.20 to .27 
ery 
. 31 to 
. 04 
.09 to. 
. 04 
.16to. 
.29 to. 
.30to. 


. 373 to . 126 
. 387 to . 774 
. 420 to . 653 







TABLE II (Continued) 





(No. 60 continued) 
Gain 


.340to .641 
. 353 to -. 001 
. 099 

. 256 

. 046 

. 243 

. 020 to . 489 
. 425 to . 611 
176, -.124 
.518 to . 312 
.10 

a0 to .38 


Pupil Questionnaire, Wilbur Brook- 
over, Indiana State Teachers Col- 
lege 


Pupil Reaction Instruction, H. M. 
Anderson, Department of Education 
University of Wisconsin 


Ranking Questionnaire, Belief of 

Advantages of Various Occupations,

Department of Education, University 

of Wisconsin NG . 450 to . 619 
. 451 to . 448 


Right Conduct Test, H. A. Wood, 
Grades 7-12, Hillsdale School Sup- 
ply Co., Hillsdale, Michigan, 1936. 







TABLE II (Continued) 








Rudisill Scale for the Measurement 
of the Personality of Elementary 
Teachers, M. Rudisill, University 
of Wisconsin. Unpublished. 


Safeguarding Public Health, C. J. 
Daggart and class in test construction,
Wisconsin State Teachers College,
Whitewater


Scale for Evaluating Personal Fit- 
ness of Teachers, W. W. Charters, 
Ohio State University, Columbus, 
Ohio (mimeographed) 


Scale for measuring the attitude to- 
ward any teacher, Form A, direct- 
ed and edited by H. H. Remmers, 
Division of Educational Reference, 
Purdue University 


Scale for measuring the attitude to- 
ward teachers and teaching profes- 
sion, Tressa C. Yeager, Bureau of 
Publications, Teachers College, 
Columbia University (mimeo- 
graphed) 


Scale for rating certain acoustic fac- 
tors of speech by auditory impres- 
sion - reading. University of Wis- 
consin (mimeographed) 


Scale for rating certain factors of 
speech by auditory impression - 
speaking. University of Wisconsin 
(mimeographed) 


.10 to .17 


- 11 to . 37 


02 to .07 
10, . 48 
28 


. 62 

. 12 to 2.50 
. 198, -. 003 
. 164 

. 154, -. 104 
. 1% 

a 

.03 to . 47 







TABLE II (Continued) 








Scale of Civic Beliefs, Forms A,B, 

J. W. Wrightstone, Institute of 

School Experimentation, Teachers 

College, Columbia University ‘ oT 
. 81 to 16. 43 
. 6% 
. 29 


Scale on Eight Personal Teacher 

Qualities, mean of 8 qualities . 288 to. 
.070 to. 
.046 to. 
. 275 to. 
. 205 to. 
.139 to. 


Schutte Scale for Rating Teachers, 
World Book Co., Yonkers-on-Hud- 
son, New York 


Sims Score Card for Socio-econom- 
ic Status, Form C, V. M. Sims, 
Public School Publishing Co. , 
Bloomington, Illinois 


Sixteen P. F. Test, Forms A and B,

Raymond B. Cattell, Institute of 

Personality and Ability Testing, 

Champaign, Illinois .71 to .93(M) 
.50to .88(S) 




Social Adjustment Inventory, Sapich 
Edition, J. N. Washburne, Syracuse 


University . 92 (M) = 07 


< 


Average <~342. 49 
Score 329. 66 
1. 18(C.R.) 
: (C.R.) 
45 (C.R.) 


oe. me 


376. 80 
386. 6 

353. 33 
Mean Gain 8. 86 
Mean Gain 16.18to26.50 


Average 
Score 


~~ 2 


a &. 4 





TABLE II (Continued) 








(No. 77 continued) 


Social Attitude of Secondary Teach- 
ers, G. W. Hartman, Teachers 
College, Columbia University 
(mimeographed) 


Social Intelligence Test, George 
Washington University Series, 
Grades 9-16 and adults, 1930, I 
Form. F. A. Moss, T. Hunt and K. 


T. Omwake, Center for Psycholog- 
ical Service, George Washington 
University 


Social Proficiency Test, Virgil D. 
Jackson, in ‘‘The Measurement of 
Social Proficiency,’’ Journal of Ex- 
perimental Education, VIII, 1940- 





Special Pronunciation Test on 
Words, University of Wisconsin 
(mimeographed) 


Special Proficiency Test - Speech 
University of Wisconsin 


Stanford Educational Aptitude Test, 
M. B. Jensen, Stanford University 
Press 

T-A 


00, . 24 
44, 04 
.07 to .17 


. 81 
6. 50 to 7. 87 

. 8% 

.01, .38 

. 21 to .52 


. 327 to -. 016 
. 365, -. 354 
.131, -. 0064 


. 94 
. 56 to 47.06 







TABLE II (Continued) 





Val. 





(No. 83 continued) 


A-R 


Strayer-Engelhart Teacher Rating 
Scale, Short Form, C. F. Williams 
and Son, Inc., Albany, New York 


Student Data Booklet, School of Ed- 
ucation, University of Wisconsin, 
Personnel Research Office (mimeo- 
graphed) 


Student-Teacher Social Distance 
Scale, adapted from Classroom So- 
cial Distance Scale. Horace Mann- 
Lincoln Institute of School Experi- 
mentation, Columbia University 


Supervisors’ and Evaluators’ Scale 
and Anecdotal Report of Visitation 
and Conference, Western Dane 
County Curriculum Committee, 
Madison, Wisconsin 


Teachers’ Summary on the Tasks of 
School Educational Attitudes, West- 
ern Dane County Curriculum Com- 
mittee, Madison, Wisconsin 


Teachers’ Summary on the Tasks of 
School Educational Practices, West- 
ern Dane County Curriculum Com- 
mittee, Madison, Wisconsin 


Teacher Judgment Test, Department 
of Education, University of Wiscon- 
sin 


. 08 

.06 to .14 

. 00 

. 37 to 44.62 
.15 to .18 

. 7 

. 37 to 47.12 
ke 

.07 to .09 


. 278 to .512 


. 000 to . 709 
. 207, . 852 

. 201 

. 483 


.14to . 79 










TABLE II (Continued) 








Teacher Placement Department 
Rating Sheet, Form TPD-S, Indus- 
trial Commission of Wisconsin, 
Madison, Wisconsin 


Teacher Questionnaire, Wilbur 
Brookover, Indiana State Teachers 
College 


Teacher Self-Evaluation Form, H.
M. Anderson, Department of Educa- 
tion, University of Wisconsin 


Teacher-Student Social Distance
Scale, adapted from Classroom So- 
cial Distance Scale, Horace Mann- 
Lincoln Institute of School of Exper- 
imentation, Columbia University 


Teacher-Teacher Social Distance 
Scale, adapted from Classroom So- 
cial Distance Scale, Horace Mann- 
Lincoln Institute of School of Exper- 
imentation, Columbia University 


Teachers College Psychological Ex- 
amination, State Teachers College, 
St. Cloud, Minnesota 


Teaching Situation Inventory, Uni- 
versity of Wisconsin (mimeo- 
graphed) 


Yule’s Q 


L 


. 52 to . 89 


57 to . 70 
67, . 62 
54 


. 068 to . 166 
. 441 


. 088, . 321 

. 045 

. 133 

. 032 to . 581 







TABLE II (Continued) 








Technical Agricultural Information, 
adaptation of Mastery Tests in Ani- 
mal Husbandry, Interstate, Dan- 
ville, Illinois 


Test of Teaching Problems, T. L. 
Torgerson, Department of Educa- 
tion, University of Wisconsin (mim- 
eographed) .11 to . 44 


Test on Alaska, C. D. Jayne, Uni- 
versity of Wisconsin 


Theory and Practice of Mental Hy- 
giene, T. L. Torgerson, Depart- 
ment of Education, University of 


Wisconsin (mimeographed) . 367, -.029 


. 243, .350 
.27to .54 


Thurstone Temperament Schedule,

L. L. Thurstone, Grades 9-16 and 

Adults, 1949-50, Science Research 

Associates, Chicago .61 to .82 (S) 
-48to .77 (S) 
-46to .77 (S) 




Torgerson Diagnostic Rating Scale 

of Instructional Activities, T. L. 

Torgerson, Public School Publish- .86 to .89 (S) 

ing Co., Bloomington, Illinois .92 (S) . 56 


Mean . 18 to 51.68 

Gain . 509 to . 802 
. 124 to . 302 
. 275 to .103 
.139, .002 
. 404 to .539 
. 43 
-16 to . 34 









TABLE II (Continued) 











Wisconsin Adaptation of the M-Blank,
original M-Blank Data for Individual 
Staff Members, 1940 Edition, Coop- 
erative Study of Secondary School 
Standard ‘ PuR 
ISRp 
PeR 
° ISRsdr 
.61 to . 77 ISRte 
. 80 ISRp 
. 880 Cisr 
. 880 Cisr 
. 831 Cisr 
. 723 Cisr 
. 835 Cisr 





*Mathews (54) represented the calculated percentages as the summary of results of item validation.





JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961) 


CHAPTER V 


THE USES AND ABUSES OF CORRELATIONAL AND REGRESSION TECHNIQUES
IN THE EVALUATION AND PREDICTION OF TEACHER EFFECTIVENESS 


ALLEN ABELL 


The ability to predict the value of one vari- 
able (here, teacher effectiveness) on the basis of 
one or more variables (predictors) depends on 
the relationship or covariance between the vari- 
ables in the underlying population. While the er- 
ror in prediction depends on the degree of covar- 
iance, the more important and often neglected 
aspect is the consideration of the actual but un- 
known degree of covariance in the population in- 
volved. 

If a truly random sample is taken from a well 
defined population, it is possible to establish, on 
the basis of probability, the sampling error of 
the predictive statistic that is used. On the other
hand, if the group studied is either a whole cur- 
rent population or a random sample from a whole 
current population, any predictive device can 
merely predict that which has taken place. 

It would seem reasonable for predictive pur- 
poses to define the population as that being con- 
structed year after year in segments by the rel- 
atively stable social forces in operation through 
some period of time. The extent of the population
defined would be limited only by the purposes of 
the study and the degree of relationships that ac- 
tually exist in the population. It is possible that 
the most important traits of good teachers might 
range from common qualities of all teachers in 
the nation down to specific qualities of groups of 
teachers working at certain grade-areas in lim- 
ited localities. 

If there is a non-situational nationwide agree- 
ment, which there is not likely to be, as to the 
qualities of a ‘‘good’’ teacher, then this would be 
the starting point for defining the population to be
investigated. If this is not the case, perhaps 
one’s definition would start with teachers in re- 
gions, teachers in states, or teachers from spe- 
cific institutions, and possibly be further limited 
to teachers to be employed in specific education- 
al levels and specialized subject areas within ac- 
ceptable geographic sections. 

To further illustrate the defining of a population, let us
consider one institution such as a state university. Each year
a group of individuals is graduated from the institution,
certified to teach at some subject-area level, and gains
employment within a limited geographic area. It does not seem
unrealistic to consider this group as a segment of a population
of teachers that is composed of selected people who come from
reasonably common cultural backgrounds, geographic areas, and
public schools, and who attend the university. The selective
factors of academic grades, school of education admission
requirements, and graduation and certification requirements
would give additional commonness to the group with already
similar cultural backgrounds. Hopefully, this commonness could
be expanded beyond the doors of one institution.


*Normal approximation without correction for skewness of r dist. S.E. =

If this population has measurable pre-service variables that
are closely related to their achievement as future teachers,
either as teachers in any subject field and/or geographic area,
or as teachers in more specific fields and/or more specific
localities, and there are some statistical devices to handle
adequately these variables, the problem of prediction should be
solvable. [If not,


Let us next consider the possible nature of the distribution of
variables in such a population and the possible devices for
discovering the unknown relationships. It seems reasonable to
assume that teachers are a selected group (and there are some
data to support this assumption for some institutions) and that
teaching requires particular characteristics of individuals; it
would still seem reasonable to assume that teaching ability is a
nearly normally distributed variable and that the possible
pre-service covariables also would approach normality in this
selected population. If this seems acceptable, let us look at
two hypothetical extremes that one might find in a "sample" from
this multivariate normal distribution by considering teaching
ability and one possible covariate. Figure A and Figure B
illustrate how misleading a low correlation and a high
correlation, respectively, between two variables in a sample
might be in light of the underlying population relationship. To
further emphasize this point, let us look at Table I of
approximate* confidence intervals for a Pearson correlation
coefficient for a sample of fifty cases if the sample were a
random sample from the population instead of a time sequence
segment of the population. Now it is possible that a time
sequence sample could have either wider or narrower confidence
intervals than those given in Table I, hopefully the latter if
proper but still useful population limits can be established.


FIGURE A. Grade Point Average (if a covariate) plotted against
Principal's M-Blank.


TABLE I

    r        95% Confidence Interval

   .00          -.28 to +.28
   .20          -.08 to +.48
   .30          +.04 to +.56
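
The intervals in Table I can be approximated with a few lines of Python. The sketch below is illustrative only: the footnote's standard-error formula is cut off in the source, so the form S.E. = (1 - r^2) / sqrt(N - 1) used in `normal_approx_ci` is an assumption, as are the function names and the 1.96 multiplier. The results need not agree with the table to the last digit. For comparison, `fisher_z_ci` gives the interval based on Fisher's z transformation, which does allow for the skewness of the distribution of r.

```python
import math

Z_95 = 1.96  # two-sided 95% critical value of the normal curve


def normal_approx_ci(r, n, z=Z_95):
    """Symmetric interval r +/- z * SE, with SE = (1 - r^2) / sqrt(n - 1).

    The SE formula is an assumed form; the source footnote is truncated.
    """
    se = (1.0 - r * r) / math.sqrt(n - 1)
    return r - z * se, r + z * se


def fisher_z_ci(r, n, z=Z_95):
    """Interval computed on Fisher's z = atanh(r), then transformed back to r."""
    center = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(center - z * se), math.tanh(center + z * se)


if __name__ == "__main__":
    for r in (0.00, 0.20, 0.30):
        lo_n, hi_n = normal_approx_ci(r, 50)
        lo_f, hi_f = fisher_z_ci(r, 50)
        print(f"r = {r:.2f}  normal: {lo_n:+.2f} to {hi_n:+.2f}  "
              f"Fisher z: {lo_f:+.2f} to {hi_f:+.2f}")
```

Either way, with fifty cases a sample coefficient of .20 or less is compatible with a population value of zero, which is the point the table is meant to make.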

FIGURE B. Grade Point Average (not a covariate) plotted against
Principal's M-Blank.


In light of these and other inadequacies of the correlation
coefficient or related covariance measures in a time sequence
sample (rather than random samples), the question arises as to
what course of action is possible to ferret out the
relationships which are hopefully present. The answer, of
course, lies in the unknown population covariances and whether
these exist in substantial magnitude from the broadest
nationwide, statewide, or citywide teacher evaluation down to
the teacher graduates of specific institutions teaching a
specific subject at a specific level. If one is willing to agree
that teaching ability is predictable, or in other words that the
selective factors of teachers are relatively stable in our
society over a period of time, then the process of cross
validation and restructuring of predictive statistics with
sequential gains in information seems the best route, whether
one does it over a period of years at one institution or one
takes a broader population definition and attempts to discover
underlying relationships in a shorter period of time by
accepting uniformity in judgment of teaching quality over a
large geographic area.
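
One way to picture cross validation with sequential gains in information is sketched below in Python. Everything in it is hypothetical: the cohorts are simulated rather than drawn from the Wisconsin data, a single standardized predictor stands in for the pre-service measures, and the population correlation of .50 is chosen only for illustration. Each year's prediction equation is fitted on all cohorts seen so far and then checked against the next cohort.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def make_cohort(rng, n, rho):
    """Hypothetical cohort: predictor x and criterion y, population correlation rho."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
    return xs, ys


def sequential_cross_validation(cohorts):
    """Fit on every cohort seen so far, then test the fit on the next cohort."""
    results = []
    for k in range(1, len(cohorts)):
        train_x = [x for xs, _ in cohorts[:k] for x in xs]
        train_y = [y for _, ys in cohorts[:k] for y in ys]
        intercept, slope = fit_line(train_x, train_y)
        test_x, test_y = cohorts[k]
        errors = [y - (intercept + slope * x) for x, y in zip(test_x, test_y)]
        rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
        results.append((slope, rmse))
    return results


if __name__ == "__main__":
    rng = random.Random(1961)  # fixed seed so the run is repeatable
    cohorts = [make_cohort(rng, 50, 0.5) for _ in range(4)]
    for year, (slope, rmse) in enumerate(sequential_cross_validation(cohorts), start=2):
        print(f"tested on cohort {year}: slope = {slope:.2f}, hold-out RMSE = {rmse:.2f}")
```

If the fitted slope settles down and the hold-out error stays steady as cohorts accumulate, the assumption of stable selective factors is at least not contradicted; erratic values from year to year would suggest that the population has been defined too loosely.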

The main point here is that a correlation coefficient or
related measures of covariance are extremely useful, but in the
context of prediction of teacher ability they must be used with
extreme care or they can be very misleading. The abilities and
achievements studied in the investigations here summarized are
those of a selected group and consequently limited in variance,
which in turn may reduce the covariance in the underlying
population. If relationships do exist in the underlying
populations of which these studies are a segment and the
relationships are supplementary, then these carefully conducted
studies should provide us with a start in the search for the
population relationships that, when found, will in a composite
form provide us with adequate predictions.
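
The effect of selection on an observed correlation can be shown with a small simulation. The Python sketch below is an illustration under stated assumptions, not an analysis of the Wisconsin data: the two variables are simulated as bivariate normal with a population correlation of .50, and "selection" is taken to mean keeping only the top quarter of cases on the predictor. The function names, the seed, and the cut-off are all hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def restriction_of_range(rho=0.5, n=20000, keep_top=0.25, seed=30):
    """Return (r in the full group, r in the group selected on the predictor)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((x, y))
    pairs.sort(key=lambda p: p[0])                    # order cases by the predictor
    selected = pairs[int(n * (1.0 - keep_top)):]      # keep only the highest scorers
    r_all = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
    r_selected = pearson_r([p[0] for p in selected], [p[1] for p in selected])
    return r_all, r_selected


if __name__ == "__main__":
    r_all, r_selected = restriction_of_range()
    print(f"full group r = {r_all:.2f}, selected group r = {r_selected:.2f}")
```

The coefficient in the selected group comes out well below the coefficient in the full group, although the underlying relationship is the same in both. A low correlation in a highly selected group of teachers is therefore weak evidence that a measure is unrelated to effectiveness.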


Related Considerations 





The remaining portion of this chapter will deal
with a new look at a few of the studies here sum- 
marized with an eye to possible improvements in 
data handling. But before this, it is necessary 
to clarify some considerations about tests which 
are employed. 

The greatest problem in predicting teacher effectiveness lies
in the definition and evaluation of teacher effectiveness. In
our society we not only have to judge quality of teaching in
light of the opinions of psychological, philosophical, and
sociological experts, but also in light of what society
wants as a product of its schools. First there is 
need of a realistic definition of performances that 
achieve the desired goals, and second, we need 
accurate and unbiased means of measuring these 
performances. 







In the studies here reviewed, both pupil gain 
as measured by standardized tests and supervi- 
sory ratings orientated around the Wisconsin 
Adaptation of the Principal's M-Blank are used.
These criteria are considered because both types 
of evaluations have been used in the past and not 
because they are accepted as being adequate in 
these studies. 

The problem of finding pre-service covariates
that exist in the population would seem to be
a lesser problem once a realistic and adequate
definition of the qualities expected in good
teachers has been determined. Objective and
subjective devices that measure or purport to
measure qualities and achievements of people can be
tried and improved or discarded as they have been
in the past. Hand in hand with this process will
go various statistical treatments of the covariates
or apparent covariates. Perhaps the additive
assumptions which are made in connection with
most predictive statistics may have to be
reconsidered, and possibly other approaches might be
employed. One might, for example, define the
lack of teaching ability as the absence of one
quality or being very low in a particular quality
no matter how high the person measures on other
qualities related to success. Perhaps a
non-additive statistic will evolve that will more
adequately represent the population relationships.
This possibility will be explored in a later chapter.
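
The contrast between an additive and a non-additive rule can be sketched as follows. This is only an illustration under stated assumptions: the scores are hypothetical Z scores, the equal weights are arbitrary, and the "weakest quality" rule is one possible non-additive form, not a statistic proposed in these studies.

```python
import numpy as np

def additive_composite(z, weights):
    """Weighted sum of standardized predictor scores (one row per person)."""
    return np.asarray(z, dtype=float) @ np.asarray(weights, dtype=float)

def conjunctive_composite(z):
    """One possible non-additive rule: the composite equals the lowest quality."""
    return np.asarray(z, dtype=float).min(axis=1)

# Hypothetical Z scores on three qualities for two candidates.
scores = np.array([
    [0.5, 0.5, 0.5],    # moderate on every quality
    [2.0, 2.0, -2.5],   # high on two qualities, very low on the third
])
equal_weights = np.array([1 / 3, 1 / 3, 1 / 3])  # arbitrary illustrative weights

additive = additive_composite(scores, equal_weights)
conjunctive = conjunctive_composite(scores)
print(additive)      # both candidates receive the same additive score
print(conjunctive)   # the second candidate falls far below the first
```

Under the additive rule the two candidates are indistinguishable, while the non-additive rule separates them, which is the kind of case the preceding paragraph has in mind.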


A Review of Selected Studies 





The first approach made by the writer was 
to examine the correlations from study to study 
to get an idea as to the relationships that might 
exist in an undefined but possible time sequence 
population. It was found that there was so much 
variation in the types of pre-service and co-ser- 
vice measures used as well as some variation in 
the evaluation criteria that this approach could 
not be pursued extensively. 

Two studies were found, however, that were
sufficiently parallel to warrant a contrast of
correlation coefficients for consistency and range of
magnitude. These were studies by Rostker (66)
and Rolfe (65) which were carefully done in the
late thirties and in which the students were under
the influence of only one teacher.

Rostker’s study involved seventh- and eighth- 
grade citizenship classes in small non-depart- 
mentalized graded schools. Rolfe’s study in- 
volved seventh- and eighth-grade citizenship 
classes of one- and two-room rural schools. 
While there is a difference in the type of school, 








both studies covered comparable geographic 
areas, both groups of teachers had similar edu- 
cational backgrounds and teaching experiences, 
and both groups had similarly prescribed units of 
instruction in citizenship that were evaluated for 
residual pupil gain. This gain involved the sta- 
tistical control of uncontrolled variables such as 
I.Q., reading, socio-economic strata, and initial 
test scores. The standardized evaluation tests 
dealt with areas of health, community planning, 
abilities to organize research materials, civic 
beliefs, applying generalizations, civic attitudes, 
civic information, and civic action. The corre- 
lations with residual gain as the criterion are 
shown in Table II. 

The lack of close agreement in correlations 
could be due to many different reasons, but rather 
than speculate as to this it was decided to take 
the variables with the closest agreement and 
larger magnitudes and contrast the extreme cases. 
While this may at first appear to be an overfitting
of data, it was adopted under the hypothesis that
the extreme cases would magnify the differences
between good and poor teachers and help
compensate for the errors of measuring devices. In
addition to this it would seem more valuable from
a practical point of view to be able to predict a
likely high degree of success or failure rather
than likely average success.

Three devices were used as a basis for
comparison. Multiple regression was used as a
standard for comparison because of its wide use in
prediction. Fisher’s Discriminant Function and a
combination of Wherry’s Multiple Bi-Serial and the
wide-spread biserial* were used on the extremes.

In all three devices the best predicted score
is based on a linear combination of weighted
predictors, which assumes the quality of additivity:

$Z = W_1 Z_1 + W_2 Z_2 + W_3 Z_3 + \cdots$

Here $Z$ is an individual’s predicted score, the
$Z_j$ are the individual’s covariable or predictor
scores, and the $W_j$ are the weights arrived at by
one of the processes below.

Judgment of how well the particular statisti- 
cal function fits the apparent ‘‘sample’’ relation- 
ships (not the population relationships) is made 
on the basis of how closely the individual’s
predicted scores approximate the observed scores.

The Multiple Regression employed is based 
on the hyperplane of best fit for the observed data, 
or in a two-dimensional system the line that
passes through the scatter diagram of two corre- 
lated variables in such a manner that the sum of 
the squared deviations between the predicted points 
(on regression line) and the observed points is a 


*Robert J. Wherry, ‘‘Multiple Bi-Serial and Multiple Point Bi-Serial Correlation,’’ Psychometrika 
(1947), pp. 189-95. 





Charles C. Peters and Walter R. Van Voorhis, Statistical Procedures and Their Mathematical Bases, 
(New York: McGraw-Hill Book Co. , 1940), pp. 384-91. 








TABLE II 





Co-service Teacher Variables Rostker 





Intelligence (A.C. ) ~ 57 
Intelligence (T.C.) . 40 
Social Adjustment (W. ) 
Social Attitude (H. ) . 52 
Aims of Education (L and S) . 30 
Leadership (M. ) . 20 
Bernreuter Dominance (Bd. ) 2d 
Teacher Knowledge of Mental Hygiene . 45 
Stanford Aptitude (A. R. ) . 04 
Civic and Government Test (A.C. ) . 36 


Professional Attitude (Y) ~ 45 





TABLE III 








Rostker Regression .27Zj , : .45Z4 4025 


W.S. Biserial .41Z) . ‘ . 11Z4 - 4425 


Discrim. Function .51Z) . , .97Z4 .23Z5 





Regression .17Z) . ‘ .26Z4+ .23Z5 
W.S. Biserial . 2921 ‘ .59Z4 + 1.2825 


Discrim. Function 422) . P .58Z4 + 1.5825 
















minimum, or $\sum [Z(\text{observed}) - Z(\text{predicted})]^2$ is
a minimum. In matrix notation the Beta Weights,
which are the coefficients for the ‘‘plane’’ of best
fit, are derived by $B = R^{-1} r$, where $R^{-1}$ is the
inverse of the intercorrelation matrix of co-variates
and $r$ is a column of correlations between the
criterion and the co-variates.
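
As a minimal sketch of the matrix formula for the Beta Weights, the following solves for B from an intercorrelation matrix R and a column of criterion correlations r. The correlation values are hypothetical and are not taken from Rostker or Rolfe; the function name is likewise only illustrative.

```python
import numpy as np

def beta_weights(R, r):
    """Standardized regression weights: solve R b = r, i.e. b = R^{-1} r."""
    return np.linalg.solve(np.asarray(R, dtype=float), np.asarray(r, dtype=float))

# Hypothetical values: two predictors intercorrelated .50,
# correlating .60 and .40 with the criterion.
R = np.array([[1.0, 0.5],
              [0.5, 1.0]])
r = np.array([0.6, 0.4])

b = beta_weights(R, r)
multiple_R = np.sqrt(b @ r)   # multiple correlation implied by these weights
print(b, multiple_R)
```

Because the two predictors overlap, the second receives a weight well below its own correlation with the criterion, which illustrates how the weights reflect the inter-relationship of the predictors as well as their relationship to the criterion.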

The discriminant function arrives at weights
for the co-variates by maximizing the ratio
between the mean difference between the two groups
and the square root of their pooled within
variance, the purpose of which is to weight the
predictors in such a manner that the predicted scores
form two groups, one high and the other low, so
that there is a large difference between the means
of the composite scores by groups and a small
overlap of the two distributions. In matrix
notation the weights are obtained by $W = S^{-1} d$, where
$S^{-1}$ is the inverse of the pooled within group sum
of squares and cross products matrix and $d$ is
the column of mean differences between the groups
of predictors.
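
The discriminant weights described above can be sketched as follows, assuming two small groups of hypothetical predictor scores (rows are teachers, columns are predictors). The data and function name are illustrative only.

```python
import numpy as np

def discriminant_weights(high, low):
    """Discriminant weights W = S^{-1} d for a high and a low group."""
    high = np.asarray(high, dtype=float)
    low = np.asarray(low, dtype=float)
    d = high.mean(axis=0) - low.mean(axis=0)        # column of mean differences
    dev_high = high - high.mean(axis=0)
    dev_low = low - low.mean(axis=0)
    S = dev_high.T @ dev_high + dev_low.T @ dev_low  # pooled within-group SSCP matrix
    return np.linalg.solve(S, d)

# Hypothetical scores on two predictors for four high and four low teachers.
high_group = np.array([[1.0, 0.5], [1.5, 1.0], [0.5, 1.5], [1.0, 1.0]])
low_group = np.array([[-1.0, 0.0], [-0.5, -1.0], [-1.5, -0.5], [0.0, -0.5]])

w = discriminant_weights(high_group, low_group)
composite_high = high_group @ w
composite_low = low_group @ w
print(w, composite_high.mean() - composite_low.mean())
```

The difference between the two composite means is always positive with these weights, since it equals the quadratic form of d in the inverse of S.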

The multiple-wide-spread-biserial is quite 
similar to the discriminant function in that it 
maximizes the ratio between the mean difference 
between composite scores to the square root of 
the variance of the whole distribution. The chief 
difference is that one uses the pooled within group 
variance and the other uses the variance of the 
whole distribution of composite scores even 
though the means are based on two partial
segments of the whole distribution. If the tests are
in Z score form, the matrix formula for the
weights is $W = R^{-1} d$, where $R^{-1}$ is the inverse of
the intercorrelation matrix of predictors based
on the complete distributions and $d$ is the column
of mean differences in Z scores between the
predictor groups.
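
A parallel sketch for the wide-spread biserial weights follows. It assumes hypothetical raw predictor scores for eight teachers, with the three lowest and three highest cases on the criterion supplied as index arrays; the grouping and the data are illustrative, not drawn from the studies.

```python
import numpy as np

def wide_spread_biserial_weights(predictors, high, low):
    """Weights W = R^{-1} d, with R and the Z scores based on the whole distribution."""
    x = np.asarray(predictors, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0)   # Z scores over all cases
    R = np.corrcoef(z, rowvar=False)            # intercorrelations of predictors
    d = z[high].mean(axis=0) - z[low].mean(axis=0)
    return np.linalg.solve(R, d), R, d

# Hypothetical raw scores on two predictors, ordered by criterion standing.
predictors = np.array([[1, 2], [2, 1], [3, 4], [4, 3],
                       [5, 6], [6, 5], [7, 8], [8, 6]], dtype=float)
low = np.array([0, 1, 2])    # assumed low extreme group
high = np.array([5, 6, 7])   # assumed high extreme group

weights, R, d = wide_spread_biserial_weights(predictors, high, low)
print(weights)
```

The only difference from the discriminant sketch is the matrix being inverted: the intercorrelations of the complete distribution take the place of the pooled within-group matrix.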

To illustrate the role of these predictive
devices with a simplified example, consider Figures
C-1a through C-3. The basic equation
$Z(\text{composite}) = W_1(\text{I.Q.}) + W_2(\text{Grade Point Average})$
determines a composite score for each individual
that is related to his evaluation score (here
M-Blank) by some mathematical criterion. The
mathematical criterion determines the weights
to be used on the predictors. In general, the size
of the weights reflects both the relationship
between the predictors and the evaluation criterion
and the inter-relationship of the two predictors;
that is, if the predictors both measure the same
human quality, one adds no new information when
combined with the other.

In the case of regression (Figure C-1a), the
scores of each individual for the three variables
(M-Blank, I.Q., and Grade Point Average) give
the coordinates of a point in three-space. The
points for all the individuals tend to form a
‘‘football’’ shaped configuration.

The object in multiple regression is to find 
the plane that passes through this configuration in 





FIGURE C-1a (vertical axis: M-Blank Scores)





such a manner that the sum of the squared
deviations of the observed scores from the predicted
scores (on the plane) is a minimum. Figure C-1b
shows one of the many distances to be squared
and summed. $Z_o$ is one observed score (here
above the prediction plane) and $Z$ is the predicted
score on the plane, for a person with the
particular G.P.A. and I.Q. score that corresponds to
the $Z_o$ observed score. If the relationship were
perfect, that is, if you could predict M-Blank
rating perfectly from G.P.A. and I.Q., then the
configuration of C-1a would collapse into a flat
oval shape and all the observed scores would lie
on the prediction plane. The degree of departure
from this ideal serves as the basis for judging the
adequacy of the regression techniques for
functional representation of the observed sample
relationship.


FIGURE C-1b (axes: M-Blank Scores, I.Q.; prediction surface shown as a plane)


In the cases of the discriminant function and 
the multiple wide-spread biserial, the mathemat- 
ical criterion is not the closeness of agreement 
between the observed and the predicted scores as 
in the case of multiple regression. The same 
basic equation is used
[$Z(\text{composite}) = W_1(\text{I.Q.}) + W_2(\text{Grade Point Average})$]
in the discriminant function, but in this case the
individual composite scores are derived by
weighting the predictors so that the distributions
of composite scores for the two groups, the very
successful and the unsuccessful teachers, will be
as distinct as possible.
The means of the two groups of composite scores 
should be as far apart as possible and yet main- 
tain a small overlap of scores (Figure C-2). The 
purpose is to distinguish as accurately as possi- 
ble between likely good or poor teachers on the 
basis of the predictors. 


FIGURE C-2 (Frequency plotted against Composite Scores)


The multiple-wide-spread-biserial is simi- 
lar to the discriminant function but is based on 
maximizing the wide-spread biserial correlation 
of the composite scores with teacher evaluation 
scores. The process determines weights that 
give the high and low group composite means a 
large difference without making the whole
distribution of composite scores of excessive width
(Figure C-3).


FIGURE C-3 (Frequency plotted against Composite Scores)


It must be emphasized here that predicted
scores referred to in the above explanations are 





ABELL 53 


‘‘predicted’’ in the sense that the statistical
device employed produces them as an approximation
to the observed scores and not as a true
prediction. They are calculated for the purpose of
evaluation of the predictive function. The true
prediction comes when you use the prediction equations
to estimate the future scores of a parallel popula- 
tion on the basis of the predictors. 

In the Rostker-Rolfe comparison, the following
predictors were used because of their fair
degree of consistency: $Z_1$ Intelligence (T.C.), $Z_2$
Bernreuter Dominance (Bd), $Z_3$ Social Attitude,
$Z_4$ Knowledge of Mental Hygiene, and $Z_5$
Professional Attitude. The high-low groups were
established at Z scores on the criterion of +.80 and
-.86 respectively. In Rostker’s study this kept
9 high and 6 low out of a total sample of 24, and
in Rolfe’s study 11 high and 11 low out of 57 were
kept. The regression weights are scaled
automatically by their definition, but the discriminant
function and W.S. biserial were rescaled with
a constant multiplier to make them agree ‘‘on the
average’’ with group criterion averages for
comparison purposes. The prediction equations are
shown in Table III.
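
The selection of extreme groups by criterion cutoffs might be sketched as below. The cutoffs of +.80 and -.86 are those stated in the text; the criterion scores are hypothetical, and the use of the population standard deviation in forming Z scores is an assumption.

```python
import numpy as np

def extreme_groups(criterion, high_cut=0.80, low_cut=-0.86):
    """Boolean masks for cases at or beyond the high and low Z-score cutoffs."""
    c = np.asarray(criterion, dtype=float)
    z = (c - c.mean()) / c.std()   # assumed: population standard deviation
    return z >= high_cut, z <= low_cut

# Hypothetical criterion scores for ten teachers.
criterion = np.array([10, 12, 13, 14, 15, 15, 16, 17, 18, 20], dtype=float)

high_mask, low_mask = extreme_groups(criterion)
n_high = int(high_mask.sum())
n_low = int(low_mask.sum())
n_middle = len(criterion) - n_high - n_low   # cases set aside from the contrast
print(n_high, n_low, n_middle)
```

With these hypothetical scores two cases fall in each extreme group and six are set aside, which shows how sharply such cutoffs reduce the number of cases on which the weights rest.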

The processes involved in selecting the
weights take into account the inter-relationships
between variables that are not readily seen in
glancing at a correlation matrix, but, judging by
the lack of agreement between the weights in the
two studies, it would seem that either there was
some difference in the execution of the two studies
or the level of covariances of the criteria and
covariables here included is low in the underlying
population to which these two groups of teachers
might conceivably belong.

Looking at Table IV, one sees how the three 
devices fare as to proper group placement of the 
individuals in the specific studies. 

Although there is the danger of overfitting of 
data and little evidence of underlying population 
relationships, it appears that the investigation of 
extreme cases would be worthy of future consid- 
eration. 

Both the discriminant function and the M.W.S.
biserial come closer to proper group placement
of the extreme cases than the regression
technique.

Another interesting observation in the rescaling
of weights is that in both studies the high and
low groups would have required a different
constant to achieve optimum agreement with the
criterion group averages when the predictor group
averages were used in the rescaling of the weights.
This would suggest the possibility of different
covariates for success and failure and the
possibility of being able to predict likely success or
failure but not simultaneously with the same variables.
Although these are based on very few cases, the
contrasts are very suggestive and would seem
worthy of future investigations.







TABLE IV





Proportion Outside of 
Proper Group 


Mean Deviation from
Proper Group 





Multiple Correlation 








Rostker Rolfe 


Rostker Rolfe 


Rostker | Rolfe 





Regression 


W. S. Biserial 


45 ane 





.18 . 36 


. 26 . 56 


.18 . 41 


. 16 . 50 


. 78 


53 
Discrim. Function . 36 ae 14 .39 






















TABLE V 





Rostker 
Low 











Intelligence (T.C.) . 67 
Bernreuter (Bd. ) -. 87 
Social Attitudes (H. ) .97 
Knowledge of Mental Hygiene . 70 


Professional Attitude (Y) .19 





Number 








TABLE VI 





Freshmen-Sophomore G. P. A. 
Education G. P. A. 

Practice Teaching Grade 
Initiative 


Work Habits 





TABLE VII 





Proportion Outside Mean Deviation from 
of Proper Group Proper Group (Z Score) 





Regression . 76 . 65 


W. S. Biserial . 48 . 56 


Discrim. Function . 99 .50 





Number 17 













[Rotated figure. Recoverable labels: Lins’ Data; M-Blank; W.S. Biserial; Prediction; Fresh.-Soph.; Education; Predicted Teacher Efficiency; Initiative; Work Habits; Pupil Evaluation; Ability; Social Adjustment.]





The Lins-Von Haden Studies 





As was pointed out earlier, there was con- 
siderable lack of parallelism between the studies 
so that an extensive contrast of correlations was 
impossible, and thus a comparison of predictive 
statistics evolved. It was decided to carry this 
comparison to the supervisory rating criteria 
which was part of the Lins (49) and Von Haden (82) 
studies. Both dealt with the same group of 58 
teachers and together included a large number of 
variables that are possibly co-related to the cri- 
teria of the Wisconsin Adaptation of the Princi- 
pals M-Blank; the Lins study used mostly objec- 
tive data while the Von Haden study was primar- 
ily subjective personal data. 

The group of teachers in the Lins and Von 
Haden studies consisted of 58 women teachers 
who were graduated from the University of Wis- 
consin in 1943. They were evaluated during the 
1943-44 school year while in their first year of 
service as teachers. There were teachers of the
following subjects: Art-5, English-16, History
and Social Studies-5, Home Economics-23,
Music-4, Miscellaneous-5.

The variables considered here from the Lins- 
Von Haden studies were chosen on the basis of 
their maintenance of a fair degree of correlation
with M-Blank ratings in a follow-up study by 
Briggs (14). The zero order correlations with 
the criterion for the Lins-Von Haden studies are 
shown in Table VI. 

The three statistical devices mentioned 
earlier were used to derive prediction equations 
and predicted scores. Table VII gives the resulting
contrast between the predicted scores and the
observed scores. 

Again it was noted that the scaling of the M. 
W.S. Biserial weights and the discriminant func- 
tion weights would have been more optimal for 
fitting the data if it could have been done by groups 
instead of on the average for both groups. 

A study of the other possible predictor
variables for the individuals that appeared least
predictable with the W.S. Biserial predictor
yielded the results shown in Table VIII. Some of these
additional variables show promise, but the
number of cases contrasted here is very small so it
can serve only as a suggestion for possible
supplementary variables to be investigated. It is
also interesting to note that all those who were
predicted unusually high by the W.S. Biserial
equation were rated higher in the follow-up study by





Briggs ten years later. The discrepancies in 
predictions might also be due to the possibility of 
too broad a classification of teacher skills when 
so many specific teaching areas are combined. 
Again, it remains to be seen when one begins to 
uncover the relationships in the broadest popula- 
tion category he wishes to investigate. 


Conclusions 

It should be re-emphasized that the statisti- 
cal treatment of the data in these two groups of 
studies does not imply that the relationships found 
in these studies are considered as being relation- 
ships of some population as earlier defined. The 
investigation of the extremes was done merely to 
explore the possibility of this approach as an
improved means of gaining insight into population
relationships. Also, five predictor variables were
used, instead of more, merely for computational
purposes.

The Wide-Spread Biserial method and the
Discriminant Function method are fairly
comparable in theory, but the W.S. Biserial method is
easier computationally when correlations and
standard deviations are available.

It appears that much attention should be given 
to the establishment of commonly accepted bases 
for judging teacher qualities in the population un- 
der concern and that devices for measuring these 
qualities should be developed or improved. There
is considerable evidence in these studies that the 
evaluation measures used lacked extensive reli- 
ability and validity. 

Given, if possible, these improvements, the 
treatment of data gathered might well consider 
the following points: 


1. Study of the extreme cases for adequate con- 
trasts. 

2. Consideration of the predictability of success 
or failure, separately. 

3. Non-additive predictive devices that would ac- 
count for failure when one quality is low and 
the rest high, if this proves to be the case. 

4. Canonical correlation that makes use of a
composite of teacher evaluation traits correlated
with predictors.

5. A careful follow-up study of ‘‘unusual’’
specific cases that seem unpredictable.





JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961) 


CHAPTER VI 


FACTOR ANALYSES OF THE TEACHING COMPLEX 


JOHN SCHMID * 


Teaching is a complex activity carried on in 
a complex environment--the school. It is di- 
rected by complex organisms--human beings. The 
recipients of the teaching activity are complex in- 
dividuals, students, whose characteristics are 
undergoing continuous and complex change. If we
consider teaching as a system of actions of an
agent, the teacher, intended to bring about
learning in the students, then teaching effectiveness is
a generic term relating to the teaching pole of the 
teaching-learning process. That some teachers 
are better than others is unquestioned; but the
identification of those elements in the teacher or
the teaching activity which either characterize or
are determinants of this ‘‘betterness’’ is obscured
by the realities of the teaching situation and the
semantic problems inherent in describing this
situation.

It is difficult to define teaching effectiveness 
because the elements in effective teaching appar- 
ently are not only legion but also are intricately 
interwoven. The difficulty is further compounded 
by the use of different names or word symbols 
for the same referent-element. For example, a 
teacher must have sufficient intelligence to
perform his job effectively. But this characteristic
might also be called brightness, aptitude, ability,
proficiency, etc. Although the
referent-characteristic may be the same, the use of different
verbal symbols for this characteristic engenders 
semantic and measurement problems that confuse 
understanding of its nature. 

The elements of the teaching-effectiveness 
complex arbitrarily may be divided into three 
groups: personal characteristics of the teacher, 
external activities in the classroom induced by 
the teacher, and behavioral changes in the stu- 
dents attributable to the teacher. 

It is almost impossible to separate elements 
into those which are specific facets of the
teaching-effectiveness complex and those which merely are
correlative or concomitant characteristics
attendant to it. For example, consider the
teacher’s knowledge of his students. It seems
reasonable that this characteristic is part of the
complex. But does a teacher who acquires insight
into and intimate knowledge about his students’

*All footnotes will be found at end of this chapter. 








academic abilities, interests, incentives, home
backgrounds, and personalities do a better job of
teaching because of this knowledge, or is this
merely a correlative characteristic of the good
teacher in the sense that a good teacher will
search out such knowledge of his students?

Over the past several decades, many studies 
have been carried out on teachers and related 
aspects of the teaching complex. Personal char- 
acteristics of teachers have been assessed with 
objective psychometric instruments as well as 
with ratings by superintendents, principals, 
peer-teachers, and students. The environment
of the classroom has been examined with a view to
assessing the teacher while working on the job.
Also, teachers have been assessed in terms of
the changes in behavior of their students. In
general, this latter method of evaluating teacher
performance has been accomplished by measuring
the students’ educational development at two dif- 
ferent times while they were under the direction 
of the teacher. In view of these many studies, it 
seems reasonable that the time has been reached 
when we should stop and take inventory of some 
of the data and findings which have been acquired 
over these decades. 


Procedure of This Study 





Because of the multiplicity of verbal ap- 
pellations for elements in the teaching-effective- 
ness complex, it is believed that the domain of 
the elements and their correlates should be
mapped out by means of correlation analysis to
determine if the profusion of elements is real or is
mere verbal redundancy. With factor analysis
techniques we have a statistical tool for
determining if the apparently many diverse elements
of the teaching complex are really distinct or if 
they are only semantic variations of fewer read- 
ily identifiable components. 


In some of the studies to be reported, factor 
analyses had already been performed and reported.
However, because findings may be influenced by
the factorial technique used, all of the studies
examined have been re-factored employing the same
method and criteria. 







In the historical development of factor anal- 
ysis, many factor analytic techniques have been 
developed. The application of these many tech- 
niques to the same data need not give the same 
results because different assumptions are im- 
plicit in each technique. During this develop- 
mental period, the investigator frequently 
was limited to less mathematically elegant factor 
analytic techniques because of the unavailability 
of the necessary high-speed computers which we
have today. With the development of modern
electronic computers and because we have today a
more advanced statistical theory, we are now in
position to re-examine some data of past studies
to determine if a clearer picture evolves of the
nature of the teaching-effectiveness complex. With
these points in mind, data from several studies
have been factor analyzed by the normal varimax
procedure developed by Kaiser (1958).

The normal varimax solution is signifi- 
cant because it is invariant under changes in the 
composition of the matrix of the variables being 
analyzed. Prior factor techniques had not ob- 
tained this property. 

What does this property of invariance mean? 
It means that if a set of variables is eliminated
from a correlation matrix, the factor loadings for
the remaining variables would be the same as if
the set of variables had not been eliminated. 
Stated in another way, it means: if we are given 
an infinite domain of variables and we want to 
infer the internal structure of this domain on the 
basis of a sample of variables from the domain; 
then, to the extent that the factor is invariant un- 
der various samples of the variables, there is 
evidence that inferences about the domain factors
are correct. Kaiser¹ offers a powerful
demonstration of the invariancy of the normal varimax
solution applied to sets of variables of Holzinger 
and Harman twenty-four variable empirical data. 
In addition to possessing this highly desirable 
property, the normal varimax solution apparent- 
ly tends to approximate orthogonal simple struc- 
ture which facilitates interpretation of the factors. 

The normal varimax solution is not obtained 
directly from a correlation matrix. It is obtained
by rotating other types of factor solutions to the
varimax form. Consequently, in performing the
following factor analyses, it was necessary to
decide the method by which the correlation matrices
would be factored to lesser rank. Two commonly 
used factoring techniques are the centroid and the 
principal axis methods. Before the advent of 
high-speed computers, the centroid method was
used most prevalently because of its greater sim- 
plicity in computation. However, it has been long 
recognized that the principal axis methodology is 
mathematically superior. In the following analy- 
ses, principal axes factors were obtained and 
these rotated to the varimax form. 
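
The two steps just described, extraction of principal axes followed by rotation to the normal varimax form, can be sketched roughly as follows. The correlation matrix is hypothetical, the function names are illustrative, and the iterative rotation shown is a commonly used singular-value formulation of the varimax criterion with row normalization, assumed here rather than taken from the computing routine actually used in these analyses.

```python
import numpy as np

def principal_axes(R, n_factors):
    """Loadings on the first n_factors principal axes (unities in the diagonal)."""
    roots, vectors = np.linalg.eigh(np.asarray(R, dtype=float))
    order = np.argsort(roots)[::-1][:n_factors]
    return vectors[:, order] * np.sqrt(roots[order])

def normal_varimax(loadings, max_iter=200, tol=1e-10):
    """Varimax rotation with rows normalized by the root of their communalities."""
    A = np.asarray(loadings, dtype=float)
    n_vars, n_factors = A.shape
    h = np.sqrt((A ** 2).sum(axis=1))   # square roots of communalities
    h[h == 0] = 1.0                     # guard against division by zero
    A_norm = A / h[:, None]
    T = np.eye(n_factors)               # orthogonal rotation matrix
    previous = 0.0
    for _ in range(max_iter):
        L = A_norm @ T
        G = A_norm.T @ (L ** 3 - L * ((L ** 2).sum(axis=0) / n_vars))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt
        current = s.sum()
        if previous != 0.0 and current / previous < 1.0 + tol:
            break
        previous = current
    return (A_norm @ T) * h[:, None], T

# Hypothetical correlation matrix with two clusters of two variables each.
R = np.array([[1.0, 0.7, 0.1, 0.1],
              [0.7, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.6],
              [0.1, 0.1, 0.6, 1.0]])

unrotated = principal_axes(R, 2)
rotated, T = normal_varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, the communality of each variable is the same before and after rotation; only the distribution of the loadings among the factors changes.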

Another problem that has troubled the factor 





analyst is the number of factors to extract. A
criterion suggested by Kaiser² was used. All
factors whose latent roots were greater than unity
were extracted from the correlation matrices. 
Furthermore, unities were used in the diagonal 
elements. This criterion for the number of com- 
mon factors was accepted on the basis of two 
considerations. Guttman³ demonstrated that the
number of common factors in a domain must be 
at least as large as the number of latent roots 
greater than unity of the correlation matrix with 
the ones in the diagonal. Secondly, Kaiser has 
found that the generalized Kuder-Richardson
reliability of a principal component will be positive
when the associated latent root is greater than 
one. Guttman’s theorem suggests that if latent
roots greater than unity are used as a criterion,
too few common factors may be extracted from
the correlation matrix. However, this weakness
is offset by Kaiser’s findings regarding the
generalized Kuder-Richardson reliability of a
principal component. Moreover, although it is possible
under this latent root criterion that too few
common factors may be extracted from a correlation
matrix, Kaiser points out that an analytically
rotated structure in factor analysis is less
sensitive to too few factors than to too many factors.
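
The latent root criterion itself reduces to a simple count, sketched below for a hypothetical correlation matrix with unities in the diagonal. The matrix and the function name are assumptions for illustration only.

```python
import numpy as np

def latent_root_count(R):
    """Number of latent roots of R greater than unity, with the roots in descending order."""
    roots = np.linalg.eigvalsh(np.asarray(R, dtype=float))[::-1]
    return int((roots > 1.0).sum()), roots

# Hypothetical correlation matrix with two clusters of two variables each.
R = np.array([[1.0, 0.7, 0.1, 0.1],
              [0.7, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.6],
              [0.1, 0.1, 0.6, 1.0]])

n_factors, roots = latent_root_count(R)
print(n_factors, np.round(roots, 3))
```

Since the latent roots of a correlation matrix sum to the number of variables, roots greater than unity mark the components that account for more variance than any single variable contributes.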

With all of these considerations in mind,
factor analyses were made of reported teacher 
education studies in which were found commonly 
accepted criteria of teaching efficiency: change 
in pupil behavior and ratings of teaching effec- 
tiveness. It is not feasible to present the entire 
factor solution for studies yielding many factors. 
For these cases, it was decided to report in
tabular form only factors which had significant
(larger than 0.30) loadings for the criterion
variables. Furthermore, for these factors only
variables having loadings larger than 0.30 were
considered as practically significant. The first study
reports factors for other than the criterion var- 
iables to show the general nature of the teacher 
variable domain. It is suggested that the reader 
refer to the original articles for information 
about variables involved in the studies but which 
are not included in this exposition because of 
their factorial inconsequence. The code numbers 
used by the authors of the original articles are 
used here to enable the reader to correlate the
analyses with the data of the articles.


Factor Analysis of Rostker’s Data (66) 





Rostker’s central problem was the examina- 
tion of relationships between selected teacher 
measures and learnings acquired by their pupils. 
To overcome the criticism that pupil changes are 
the result of efforts of a number of teachers, he 
used non-departmentalized schools in order that 
any found changes in a pupil could be attributed to 
a single teacher. 







Data were obtained for 24 teachers and 375 
pupils from rural and village schools, grades 7 
and 8. The teachers were asked to direct their 
instruction toward two unit topics: ‘‘Safeguarding 
Public Health’’ and ‘‘Community Planning.’’
Specific course objectives were set up and a time
table was followed. The pupils were pre- and
post-tested for informational gain at the beginning
and end of each three-week unit. The two tests
used were: Unit I - Health; Unit II - Community
Planning. Furthermore, the pupils were tested 
at the beginning of the school year and at the end 
of the school year with two batteries: the Wright- 
stone battery which measured long-term non- 
informational gain and the Hill battery which 
measured long-term informational gain. 

Twenty-eight measures were obtained for 
each teacher. The teachers’ knowledge of subject 
matter was obtained by their taking the two unit 
tests and a test from the Wrightstone battery, 
Abilities to Organize Research Materials, at the 
beginning of the year. Other teacher measures 
obtained were: intelligence, personality, at- 
titudes, and ratings. 

Intercorrelations of 35 variables were fur- 
nished by Rostker in his Table XXVII. Twenty- 
seven of these variables were teacher measures 
and the other eight were measures of pupil change 
(pupil gain). 





The Criterion Factors 





Ten varimax factors are found to describe 
the 35 variable domain. Two of these factors are
defined by pupil gain variables and one is defined
by rating of teaching efficiency. These three
factors are shown in Table II (shown at the top of
the next page). In general, descriptive names of
the nature of the variables will be used. How- 
ever, the actual test names will be used for well 
known tests. 

C-1, C-2, and C-3 are raw score pupil gain 
of the combined Unit tests, the combined Wright- 
stone tests, and the combined Hill tests respec- 
tively. C-4 is the raw score pupil gain from the 
compound of these three batteries. C-5, C-6, 
C-7, C-8 are these same measures except that 
the post-test scores have been adjusted re- 
gression-wise for differences in mental age, 
I.Q., reading ability, socio-economic status, and
pre-test scores of the pupils. 
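
One common reading of a regression-wise adjustment is to take the residual of the post-test score after a least-squares fit on the covariates. The sketch below assumes that reading; the function name `adjusted_gain`, the two covariates, and the simulated scores are all hypothetical and do not reproduce Rostker's procedure or data.

```python
import numpy as np

def adjusted_gain(post, covariates):
    """Residual of post-test scores after least-squares adjustment.

    post:        1-D array of post-test scores, one per pupil
    covariates:  2-D array (pupils x covariates), e.g. pre-test, I.Q.
    """
    post = np.asarray(post, dtype=float)
    X = np.column_stack([np.ones(len(post)),
                         np.asarray(covariates, dtype=float)])
    beta, *_ = np.linalg.lstsq(X, post, rcond=None)
    return post - X @ beta   # the part of the post-test not predicted by X

# Hypothetical data: 40 pupils, two covariates (pre-test score and I.Q.)
rng = np.random.default_rng(0)
pre = rng.normal(50, 10, size=40)
iq = rng.normal(100, 15, size=40)
post = 5 + 0.8 * pre + 0.1 * iq + rng.normal(0, 3, size=40)

gain = adjusted_gain(post, np.column_stack([pre, iq]))
```

By construction the adjusted scores are uncorrelated with every covariate in the sample, which is what allows them to be compared across classes that differ in initial ability.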

Factors I and II unquestionably reflect change
in the pupils’ intellective behavior. Although the
distinction between these two factors is far from
clear, it seems that a non-informational type of
pupil gain prevails in Factor I. Assuming this to
be a non-informational change factor, let us con-
sider some of the variables. It seems that the
intelligence of the teacher as defined by T-2
is the highest single variable conditioning teach-
ing ability. Social attitudes (T-3), attitudes
toward teaching (T-4), knowledge of subject mat-
ter (T-7), ability to organize research materials
(T-1), and the ability to recognize, diagnose, and
correct pupil maladjustment (T-5) also bear re-
lationship to this factor. The negative loading for
attitudes toward teaching, however, is puzzling.
On the other hand, ratings of the teacher and the
teacher’s personality as measured by such well
known instruments as the Bernreuter Personality
Inventory and the Washburne Social Adjustment
Inventory seem to have little or no relationship
to this factor. Factor I contains those variables
which Rostker found by multiple regression to be
the significant predictors of the criterion.

The existence of Factor II, in addition to 
Factor I, indicates the non-unitary nature of pu- 
pil learnings. The highest loadings on this factor 
are for variables all of which represent pupil gain 
in information. It is the sharpness of this factor
which helps suggest that Factor I is of a non-in- 
formational character and that the pupil change 
variables involving informational elements in 
Factor I are present because they also involve 
non-informational elements. It is of interest to 
note that for this more clearly defined Factor II, 
the attitudinal characteristics of the teacher do 
not appear, but that the teacher’s sociability (T- 
20) occurs with a moderate loading. 

Apparently, there are at least two kinds of
intellective change occurring among pupils. One
seems to be non-informational; the
other seems to be informational. It seems that
more of the teacher’s characteristics (knowledge
of subject matter, attitudes, insight into pupils’
problems) affect the non-informational acquisi-
tions of the pupils than their informational gains.
It would be interesting to examine the impact of
the teacher upon the personalities of children.

T-10, T-14, and T-16 were ratings made by
Rostker; the other ratings were made by the
teachers’ supervisors. Clearly Factor III is a
rating factor. Apparently, the attempt to rate dif-
ferent aspects of the teaching activity was not
successful. The ‘‘halo’’ effect resulted in a largely
undifferentiated global rating of the
teacher and her activities. Furthermore, none
of the other variables in the study, with the ex-
ception of the teacher’s intelligence and her
knowledge of good educational practices, had load-
ings in excess of 0.30 on this factor. As a pre-
dictable criterion of teaching effectiveness, rat-
ings of the teacher were of little value.


TABLE I


Variable                                        Loadings

Pupil gain, informational
Pupil gain, non-informational
Pupil gain, informational
Pupil gain, total (C-1, C-2, C-3)
Pupil gain (C-1 adjusted)
Pupil gain (C-2 adjusted)
Pupil gain (C-3 adjusted)
Pupil gain (C-4 adjusted)
Social Attitudes Test
Attitudes Toward Teaching
Rating of Teaching Skill
Rating of Teaching Skill
Bernreuter Fs, Sociability
Ability to Organize Research Materials
ACE Psychological Examination
Test: Recognition of Pupil Problems
Teachers College Psychological Examination
Community Planning Test (Unit II)
Rating of Teacher’s Classroom Activities
Rating of Teacher’s Personality
Knowledge of Good Educational Practices
Rating of Teacher’s Personality
Rating of Teacher’s Classroom Activities

[Loadings not recoverable; legible values: .85, .03, .15]


The Remaining Factors


Factors IV through X were the remaining
components of the 35-variable domain. On only
one of these seven factors were any of the criter-
ion variables found. These seven factors repre-
sent structuring of the domain of variables under
consideration, but with the exception of Factor
VIII they apparently bear no relationship to the
criteria of teaching effectiveness. A brief dis-
cussion of each factor follows:

Factor IV, designated as Social Adjustment,
was characterized by the Washburne Social Ad-
justment Inventory, 0.86; the Bernreuter Fs, Fc,
and Bn scales, with loadings of 0.65, 0.64, and 0.60;
and the scale of Teaching vs. Administrative
Ability, 0.53. This writer believes that the failure
of the two Flanagan scales of the Bernreuter in-
strument to separate on different factors is a re-
sult of correlated error. Subtests of a battery
frequently fail to give factor diagnosticity. This
correlated error may be called response set, test
bias, etc. Apparently, a diagnostic instrument
contained in a single booklet tends to produce cor-
related error, thus destroying the diagnosticity
of the instrument.


The Teaching vs. Administrative Ability
Test (T-23) is a subscale of the Stanford Edu-
cational Aptitude Test. This test furnishes three
ipsative scales for measuring teaching vs.
administrative abilities, teaching vs. research
abilities, and administrative vs. research abil-
ities.
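
A toy model may help show why three ipsative scales of this kind cannot vary independently. The sketch below assumes, purely for illustration, that each scale is a simple difference between two latent preference scores; the actual scoring of the Stanford Educational Aptitude Test is not described in this report, and all names and data here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
# Hypothetical latent preferences for teaching, administration, research
teaching, admin, research = rng.normal(size=(3, n))

# Three difference scales, named after the three scales in the text
t_vs_a = teaching - admin
t_vs_r = teaching - research
a_vs_r = admin - research

scales = np.column_stack([t_vs_a, t_vs_r, a_vs_r])
# t_vs_a - t_vs_r + a_vs_r is zero for every teacher, so only two
# of the three scales carry independent information.
rank = np.linalg.matrix_rank(scales)
r = np.corrcoef(scales, rowvar=False)
```

Under this toy model the two scales that share the research pole are positively correlated simply because they share a term, which is one way a near doublet such as Factor VI could arise.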

Factor V, like Factor IV, represents a per-
sonality complex; the four Bernreuter scales have
loadings as follows: Bd, 0.85; Bs, 0.77; Bn, 0.66;
and Fc, 0.45. The Teachers College Examination
has a loading of -0.56. This factor seems to be
a hodge-podge of personality characteristics
probably produced by correlated error.

Factor VI, designated as Non-Research Pro-
clivity, is defined by the variables: Test of
Teaching vs. Research Ability, 0.9; Test of Ad-
ministrative vs. Research Ability, 0.94; Commu-
nity Planning Test (Unit II), 0.44; and a test of the
teacher’s understanding of pupils’ problems (T-5),
0.36. This near doublet is best defined by two
ipsative subscales of the Stanford Educational Apti-
tude Test. The ipsative nature of these subscales
suggests that teachers have less proclivity for re-
search activity than for either teaching or adminis-
trative functions.

Factor VII is best defined by a test of social
open-mindedness, T-12, 0.83. Other moderate
definers are T-9, knowledge of civics and govern-
ment, 0.65; T-5, knowledge of pupils’ problems,
0.58; T-3, social attitudes, 0.57; and T-1, abil-
ity to organize research material.

T-12 is a test consisting of 475 true-false
items on the teachers’ knowledge in seven areas of
human experience corresponding to the cardinal
objectives of education. According to the manual,
dogmatic and superstitious persons receive low
scores while persons possessing a scientific out-
look and an open mind receive high scores. Thus
it seems quite consistent that T-1, a test of re-
search ‘‘know-how’’, has an appreciable loading
on this factor.

Factor VIII is defined by T-4, attitudes toward
teaching, 0.76, and T-22, knowledge of good edu-
cational practices, 0.69. C-7 and C-3, both pu-
pil gain variables, have loadings of -0.40 and
-0.33. This is the only factor other than the cri-
terion Factors I and II on which criterion variables
are found to have a loading in excess of 0.30.

Factor IX, designated Classroom Leader-
ship, seems to describe that aspect of the teach-
ing domain in which the teacher is a director of
the learning process. This factor is made up of:
T-17, test of classroom leadership, 0.77; T-23,
administrative vs. teaching ability, 0.61; and
T-9, knowledge of civics and government, 0.45.

Factor X is a specific factor defined by var-
iable T-8, Health Test (Unit I). The loading for
this variable is 0.94.





Summary 


The 35 variable domain of the teaching com- 
plex of this study was found to be structured into 
ten dimensions. The names of these factors are 
arbitrary. The writer did not have many of the 
original instruments and had to rely upon sec- 
ondary sources for their description. 

Factors I, II, and III are of special interest
because they are identified by criteria often used
to define ‘‘teaching effectiveness’’: change in pu-
pil behavior and rating of teaching ability.
Change in pupil behavior was found to be of two
kinds: non-informational and informational. The
teacher characteristics of intelligence, knowledge
of the subject matter being taught, attitudes, and
understanding of pupil emotional problems seem
to have greater impact upon non-informational
acquisitions of the pupils than upon their infor-
mational gains.

The other criterion, rating of teaching effec- 
tiveness, was found to incorporate to some de- 
gree the teacher’s intelligence and knowledge of 
good educational practices. Ratings of different 
aspects of the teaching complex did not result in 
much diagnosticity as evidenced by failure to sep- 
arate on different factors. 

The rest of the teaching domain was struc- 
tured into social adjustment, a personality trait 
characterized by self-sufficiency and dominance, 
a tendency to subordinate research functions to 
teaching and administrative functions, open- 
mindedness, attitudes toward teaching, and a 
trait characterized as class leadership. 





Factor Analysis of Rolfe’s Data (65) 





Introduction 


Like Rostker, Rolfe examined the relationship
of change in pupil behavior to various character-
istics of the teacher. Rolfe studied 47 teachers
from one- and two-room rural schools in Wiscon-
sin. The pupil-change criterion was based on 404
of their pupils.

Change in pupil behavior was measured by
forming a composite of the Wrightstone, Hill, and
Unit batteries mentioned in the Rostker analysis.
From this composite, variation in pupil ability
as measured by I.Q., mental age, and reading
ability was partialled out.

A second criterion of teaching efficiency ex- 
ists in this study in the form of ratings of the 
teacher and her teaching activities. Other var- 
iables included by Rolfe in his analysis were in- 
telligence; knowledge of subject matter; person- 
ality; aptitudes; attitudes; socio-economic status; 
and some characteristics of the school situation. 
Many of the tests and indexes used by Rostker 
were also used by Rolfe. 

These measures furnished a 31 x 31 correl-
ation matrix (Rolfe, Table XX). Of the eleven
varimax factors found, two were defined by the
criteria: pupil gain and rating of the teacher. As
in the Rostker analysis, the variables whose
loadings exceed 0.30 are reported for these two
criterion factors.

Factor I is a pupil-gain factor. The pupil-
gain variable of this study is similar to C-8 of
Rostker’s study in that essentially it is an ad-
justed composite of the three measures of pupil
change. The most distinctive difference between
this factor and its correspondent, Factor I of the
Rostker study, is in its failure to pick up a loading
for the teacher’s intelligence. Although the ACE
Psychological Examination was included in this
matrix, its loading on this factor was only -0.05.
It may be of interest to note that the teacher’s
ability to recognize pupils’ problems was found to
have a low moderate loading on these correspond-
ing factors (see Table II shown at the top of the
next page).

Factor II of this study is defined almost
exclusively, as is Factor III of the Rostker analy-
sis, by ratings of the teacher and her activities.
As in the Rostker analysis, low moderate loadings
are found for pupil gain and social attitudes. Once
again, the failure of the rating scales to separate
on different factors suggests that different aspects
of the teaching complex cannot be rated in a way
to furnish reliable differences. For all practical
purposes we may conclude that differences in the
rating scales were only semantic variations.





TABLE II


Variable

Bernreuter Bs, Self-Sufficiency
Washburne Social Adjustment Inventory
Rating of Teacher’s Personal Fitness
Rating of Teacher’s Personality
Rating of Teacher’s Classroom Activities
Rating of Teacher’s Personality
Rating of Teaching Skill
Social Attitudes Test
Test: Recognition of Pupil Problems
Teacher Socio-Economic Status
Pupil Gain





Using pupil gain as a criterion, Rolfe by mul-
tiple regression found those variables which were
the significant predictors. There seems to be
little resemblance between the predictors of his
study and the factorial composition of the pupil
gain factor of this analysis.


The Remaining Factors 





As in the Rostker analysis, the remaining 
factors represented dimensions in the teaching 
complex which were unrelated to the criterion 
variables of teaching efficiency. Because of this 
lack of relationship to the criteria, these factors 
will not be described beyond an arbitrary name 
which seems to this writer most descriptive of 
each factor. 

The remaining factors are: knowledge of so- 
cial goals, non-research proclivity, personality, 
attitudes toward teaching, teaching intent, teach- 
ing preparation, salary, class size, and school 
district. 


Summary 


Two criterion factors of teaching efficiency
were found: pupil gain and ratings. To a large
measure these two factors were similar to two
factors of the Rostker study.


Factor Analysis of La Duke Data (45) 





Introduction 


The purpose of the La Duke study was to de- 
termine the validity of certain teacher tests and 
rating scales as measures of teaching efficiency 
when pupil change was employed as the criterion. 

Thirty-one teachers in one-room rural schools
were studied. Approximately 200 pupils served
to furnish the sample for examining the pupil
change criterion. Classes were limited to 7th
and 8th grades.

A course of study, Community Living, was
outlined by the State Department of Public In-
struction. This course of study, consisting of
eight units, was taught by all teachers according
to a time schedule. Objective tests were con-
structed to measure four hoped-for outcomes of
instruction: appreciations, attitudes, informa-
tion, and interests. The tests were administered
at the beginning of the school year and again at its
end. To eliminate any possible ceiling effect for
the better pupils and to control for intellectual dif-
ferences, a pupil-gain criterion was produced by
adjusting, regression-wise, the post-test scores
for I.Q., mental age, and pre-test variation.

Five tests were administered to the teachers:
the ACE Psychological Examination to measure
intelligence; a test of the teacher’s ability to rec-
ognize causes and symptoms of pupil maladjust-
ment and of how the teacher would go about correct-
ing these emotional problems; a scale measuring
attitudes toward teachers and the teaching profes-
sion; a scale of educational progressivism-con-
servatism; and a social proficiency scale to measure
‘‘consideration for others’’.

In addition, the teachers were rated with
three different rating scales: ratings of classroom
activities, teaching skill, and personality. Both
the teachers’ superintendents and their
supervising teachers made these ratings. A composite rating
of the three scales was obtained separately for the superintendents and
for the supervising teachers to furnish two
independent assessments of teaching effective-
ness.

These data furnished a 12 X 12 correlation 
matrix. Four varimax factors were found, three 
being defined by some measure of teaching effi- 
ciency: pupil gain and rating. 


Varimax Solution (see Table III at top of next page) 





TABLE III


Variable                                        Loadings
                                                I   II   III   IV

Pupil Gain, Appreciations
Pupil Gain, Attitudes
Pupil Gain, Information
Pupil Gain, Interests
Pupil Gain, Composite of Above
Teacher’s Intelligence (ACE Psy. Exam.)
Understanding Pupils’ Emotional Problems
Attitudes Toward Teaching
Educational Progress vs. Conservation
Social Proficiency, Consideration of Others
Superintendent’s Rating
Supervising Teacher’s Rating


Rostker’s study indicated two facets of pupil
change: non-informational and informational. In
La Duke’s study, as a follow-up, four types of
pupil change were hypothesized. The factor solu-
tion indicates, however, that there were only two
types of change. One type incorporated appre-
ciations, attitudes, and information; the other
type incorporated interests and attitudes. The
Rostker data might have suggested an alignment
of appreciations, attitudes, and interests as non-
informational, and the informational test as evi-
dence of informational change. These data sug-
gest a more complex structure of pupil change.
Appreciation-change is the simplest definer of
Factor I of the three change variables, although
it is of near complexity two. Interests-change
defines a second change factor (IV). Attitudes-
change is of complexity two on both I and IV. How-
ever, information-change, instead of being inde-
pendent of these two factors, has a high loading on
Factor I and a low moderate loading on Factor
III. These loadings contradict any hypothesized
informational--non-informational bifurcation of
the Rostker data. Instead, this study suggests
that information and appreciations are changed
simultaneously, but changes occurring in inter-
ests are independent of informational-apprecia-
tional change.
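
The notion of factorial complexity used above can be made concrete by counting, for each variable, the factors on which its loading exceeds the 0.30 level used elsewhere in this report. The sketch below assumes that counting rule; the loadings are hypothetical and are not La Duke's values.

```python
import numpy as np

def salient(loadings, threshold=0.30):
    """Boolean mask of loadings whose absolute value exceeds the threshold."""
    return np.abs(np.asarray(loadings, dtype=float)) > threshold

def complexity(loadings, threshold=0.30):
    """Number of factors on which each variable has a salient loading."""
    return salient(loadings, threshold).sum(axis=1)

# Hypothetical loadings for three variables on four factors
names = ["appreciations", "attitudes", "interests"]
loadings = np.array([
    [0.70, 0.05, 0.10, 0.28],
    [0.45, 0.02, 0.08, 0.50],
    [0.10, 0.00, 0.05, 0.80],
])
for name, c in zip(names, complexity(loadings)):
    print(name, c)
```

In this made-up example the second variable is of complexity two, while the first falls just short of it because its second-largest loading (0.28) is below the threshold, a situation comparable to what the text calls near complexity two.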

It is of interest to note that two teacher char- 
acteristics load on Factor I: intelligence and un- 
derstanding of pupils’ emotional problems. The 
teacher’s attitudes toward teaching seem to be 
related to the interest gain factor. 

Both rating variables define Factor II. Ap-
parently, the progressive attitude of these teach-
ers was related to how they were rated. Factor
III is a near-specific factor defined by the social
proficiency variable.





Conclusions 


By correlation and regression analysis, La
Duke found little relationship between the rating
and pupil gain criteria. The separation of these
criteria on different factors in this analysis as
well as in the two previous analyses conforms to
La Duke’s findings. La Duke found that the teach-
er’s intelligence as measured by the ACE Psy-
chological Examination was significantly related
(0.61) to his criterion of pupil gain. In this anal-
ysis, we also find the intelligence of the teacher
being related to a pupil gain factor defined by the
appreciation and information change variables, but
unrelated to any of the other factors.

As in the previous studies, rating of teaching 
efficiency is unrelated to changes in pupil behavior. 


Factor Analysis of Lins’ Data (49) 





Introduction 


Lins collected data for the purpose of investi- 
gating the accuracy of predicting teaching efficiency 
as measured by ratings from measures collected 
during the period of institutional preparation. 

Correlations (Lins, 1946, Table XL) were 
available for 16 measures for approximately 58 
women teachers. In some cases, complete infor- 
mation about all variables was not available; con- 
sequently the correlations are based on varying 
numbers of cases. 
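
Correlations based on varying numbers of cases are usually the result of pairwise deletion: each coefficient uses every teacher who has both measures. The sketch below assumes that procedure; the function name `pairwise_corr` and the small data array are hypothetical and are not drawn from Lins.

```python
import numpy as np

def pairwise_corr(data):
    """Correlation matrix with pairwise deletion of missing values.

    data: 2-D array (cases x variables) with np.nan marking missing scores.
    Returns the correlation matrix and the matrix of case counts.
    """
    data = np.asarray(data, dtype=float)
    k = data.shape[1]
    r = np.eye(k)
    n = np.zeros((k, k), dtype=int)
    for i in range(k):
        for j in range(i, k):
            ok = ~np.isnan(data[:, i]) & ~np.isnan(data[:, j])
            n[i, j] = n[j, i] = ok.sum()
            if i != j:
                r[i, j] = r[j, i] = np.corrcoef(data[ok, i], data[ok, j])[0, 1]
    return r, n

# Hypothetical scores for five teachers on three measures
scores = np.array([
    [1.0, 2.0, 3.0],
    [2.0, np.nan, 1.0],
    [3.0, 6.0, np.nan],
    [4.0, 8.0, 5.0],
    [5.0, 9.0, 2.0],
])
r, n = pairwise_corr(scores)
```

A matrix assembled this way is not guaranteed to be positive semi-definite, since different entries rest on different subsets of cases; this is worth bearing in mind when such a matrix is factored.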

The criterion, Variable 0, consisted of a
composite of five supervisory ratings of the teach-
er while on the job. The correlative variables
were: high school rank, ACE Psychological Exam-
ination, reading ability, various grade point
averages, evaluation of the teaching environment,
practice teaching grades, and prediction of teaching
success from institutional data.


Factor Analysis 





Four varimax factors were found for the
data. Table IV (see top of next page) presents
the entire varimax solution. The code numbers
of the variables are those used by Lins.

Factor I is made up of variables which pri-
marily reflect academic achievement. The cri-
terion seems to bear little relationship to this
factor. Factor II largely represents tested intel-
lective ability. Again, the criterion bears little
relationship to this factor. This is not incompat-
ible with the Rostker study where the loading of
the ACE Psychological Examination was moder-
ate for rated teacher efficiency. Factor III is a
near-couplet, dominated primarily by an evalua-
tion of the teaching situation. The negative rela-
tionship between this variable and the criterion
is explained by Lins as indicating that teachers in
the less difficult positions were given higher ef-
ficiency ratings than teachers in the more diffi-
cult situations. It is on Factor IV that the criter-
ion has the highest loading. The dominant vari-
able of this factor, however, is the interview rat-
ing. Predictions of teaching efficiency from both
objective and subjective information have only
moderate loadings on this factor.


Conclusions 


The criterion of supervisory ratings of teach-
ing efficiency was found to be of complexity two.
In this respect it differs from the previous stud-
ies. However, one of the factors containing one
of its loadings was defined almost exclusively by
a judgment of the difficulty of the teaching situa-
tion. The complexity two of the criterion may be
peculiar to this situation. The criterion had its
highest loading on the factor dominated by the
interview rating (Variable 14). The interview rating was made before the
teacher entered her teaching assignment after
being graduated from college. By means of an
interview, the teachers were rated on profes-
sional judgment, sociability, work habits, gener-
al impressions, etc. This factor suggests that
perhaps a comprehensive personal interview with
prospective teachers might be a potential
predictor of subsequent rated teaching efficiency.


Factor Analysis of Von Haden Data (80) 





Introduction 


Von Haden’s study attempted to answer the
question ‘‘Do materials such as interviews and
autobiographies of students written during the
period of their institutional training contain
information concerning qualities that are related
to rated success in teaching?’’

The subjects of this study were 58 women 
teachers. Seven measures were available for fac- 
tor analysis: 


1. Difficulty of the teaching situation.


2. Interviews: two 40-minute interviews by two
staff members of the University of Wisconsin.
The interviewers rated the subjects on such per-
sonal traits as adaptability, resourcefulness,
energy, professional judgment, etc. A com-
posite measure of these two interview ratings
was made.


3. Autobiographies: the subjects, while students
in the School of Education, had written educa-
tional biographies. These were rated by three
raters. The same traits as under interviews
were combined to give this measure.


4. Interview digests: digests were made of the
interviews and resubmitted to the raters, the
subjects being anonymous. Ratings of the sub-
jects were made on this basis.


5. Two traits, initiative and professional judg-
ment, were found to be correlated with super-
visory ratings. These were used as separate
measures.


6. Criterion: This was a composite of five su-
pervisory ratings of the subjects’ teaching ef-
ficiency.


Factor Analysis 





Three varimax factors were found for these 
variables (see Table V on next page) 


Conclusions 


Factors I and III show that ratings of personal
traits from interviews and from autobiographies
yield different information about teachers. The
criterion has loadings on both factors but is more
closely related to interviews. This is consistent
with the Lins analysis. ‘‘Initiative’’ seems to
be the key characteristic of the interview meas-
ure and professional judgment that of the autobio-
graphical measure.

Factor II involves the difficulty-of-situation
and the criterion. The teachers of this study
probably are the same as in the Lins study. It is
interesting to note the invariancy of this factor al-
though different variables are involved in Lins’s
study and this one.

To Von Haden’s question ‘‘Do ma-
terials such as interviews and autobiographies of
students written during their institutional training
contain information concerning qualities that are
related to success in teaching?’’, the answer
seems to be ‘‘yes’’ for interviews. Autobiograph-
ical material is identified with an independent
factor which bears only a slight relationship to the
criterion. The factorial similarity of the rating
of initiative to the interview composite suggests
that perhaps ‘‘initiative’’ is the key characteristic
in the teacher’s rating complex. Rating of pro-
fessional judgment was factorially similar to the
autobiographical factor.


TABLE IV


Variable

Evaluation of Teaching Situation
High School Rank
English Cooperative Test
ACE Psychological Examination
Cooperative Reading Test
Predicted GPA (From Variables 2 and 3)
Freshman and Sophomore GPA
Major GPA
Minor GPA
GPA (In All Education Courses)
Practice Teaching Grades
Prediction of Teaching Effects from Objective Information
Prediction of Teaching Effects by Advisor
Interview Rating
Prediction of Teaching Effects from Subjective Data
Criterion: Rated Teaching Effects


TABLE V


Variable

Difficulty of the Situation
Ratings by Interview
Ratings by Autobiography
Ratings of Interview Digests
Rating of Initiative
Rating of Professional Judgment
Rating of Teaching Efficiency


Factor Analysis of Jones’s Data (42) 





Introduction 


Jones had available 18 variables for correlation
analysis (see Table VI at top of next page).
Two of these, Variable 0 and Variable 17, served
as criteria of teaching efficiency. Variable 0 is
a principal’s rating of the teacher’s efficiency.
Variable 17 is a measure of pupil gain based on
achievement tests administered at the beginning
and end of a three-month period. Final scores on
the achievement test were adjusted for pre-test,
mental ability, and I.Q. differences.

The other sixteen measures were intel- 
ligence, various grade point averages earned in 
college, experience, etc. 

Sixty-five teachers were studied. However,
because of incomplete information, the correla-
tions are based on varying numbers of cases
ranging from eight to sixty-five. The
modal number of cases is approximately 54.

Seven varimax factors were found. The two 
criterion variables defined two of these factors. 
Variables having loadings in excess of 0.30 are 
reported in tabular form for these criterion fac- 
tors. The same code numbers as those used by 
Jones are employed. 

Factor I is defined by gain in pupil knowledge. 
As in previous analyses, ratings of teacher effi- 
ciency have negligible loadings on this factor but 
serve to define another factor. The pupil gain 
factor fails to pick up the intelligence of the 
teacher as did the correspondent factor of the 
Rostker study. 

The loadings of the remaining variables on these
two criterion factors present no clear or definitive picture.


The Remaining Factors 





Factor III was represented by high loadings
on the ACE Psychological Examination, the Hen-
mon-Nelson Test of Mental Ability, a reading ex-
amination, and the General Culture Test. ‘‘Intel-
ligence’’ seems to be the description of this fac-
tor.

Factor IV was a doublet consisting of grades
earned in the methods and in practice teaching
courses. Factor V was defined by an activities
and interest test and the Bell Adjustment Inven-
tory. Factor VI was made up of various grade
point averages: total, freshman-sophomore, jun-
ior-senior, and education courses. Factor VII
was a specific factor of teaching experience meas-
ured in years.


Summary 


Seven factors were found to account for the 18 
variable correlation matrix. Two of the factors, 
as criteria of teaching efficiency, were of inter- 
est: change in pupil knowledge and supervisory 
rating. Although intelligence was included in the 
correlation matrix, it failed to appear on the pupil 
gain factor as in other studies. Achievement in 
formal education courses seemed to be the most 
relevant variable to pupil gain. 

The ratings of teaching efficiency failed to 
have anything in common with the teacher meas- 
ures. 


Factor Analysis of Erickson’s Data (24) 





Introduction 


Erickson factored 42 teacher measures for
64 teachers. The variables consisted of nine teach-
er ratings, seven scales of the Thurstone Temper-
ament Schedule, the sixteen scales of the Cattell
Personality Factor Questionnaire, and various
measures acquired during the institutional prep-
aration of the teachers. His analysis was done in
an interesting fashion. He formed multiple group
factors on the basis of logical clusters found in
the nine rating measures. Loadings were then
computed for the remaining 36 variables using his
clusters as the defined dimensions.

It was decided to re-factor Erickson’s data 
using the criteria employed for the other analyses 
of this report as well as applying the varimax to 
the complete 45 variable matrix. Twelve varimax 
factors were found, two of which are presented in 
tabular form because they represent criterion 
factors (see Table VII on the next page). 

Factor I corresponds closely to Factor I of
Erickson’s study. Variable 18, PF-B, General
Intelligence vs. Mental Defect, and Variable 29,
PF-Q1, Radicalism vs. Conservatism, had mod-
erate loadings in the Erickson analysis. However,
in this analysis, the loadings of these variables
were negligible. Rating of the teacher seems to
be the only complex of this factor.

Factor II only vaguely resembles Factor II
of Erickson’s analysis. The ratings are about the
same, but other teacher measures differ. On
Erickson’s factor defined by Variables 5, 6, and
7, the D-T and S-T scales of the Thurstone Tem-
perament Schedule, the A, B, E, H, I, N, O, Q1,
Q3, and Q4 scales of the Cattell P-F Question-
naire, and the practice teaching grade had sig-
nificant (above 0.30) loadings.


TABLE VI


Variable                                        Loadings

Principal’s Rating - M Blank
Bell Adjustment Inventory
GPA (Education Courses)
General Culture Test
State Department Rating
Principal’s Rating - C Blank
Pupil Gain, Achievement


TABLE VII


Variable                                        Loadings

Principal’s Acceptability Rating
Principal’s Rating, 1st Year
Supervisor’s Rating, I
Supervisor’s Rating, I
Principal’s Rating, 2nd Year
Teacher Rating by an Outside Agency
Teacher’s Self-evaluation
Rating by Other Teachers
P-F, Scale G, Positive Character
P-F, Scale Q3, Will Control

This varimax factor seems somewhat more 
definitely characterized by fewer variables. A 
conjecture might be made that the ratings of this 
factor are oriented toward the positive character 
and will control elements of the personality make- 
up of teachers. 


The Remaining Factors 





The remaining varimax factors represented 
dimensions of this domain unrelated to the criteria 
of teaching efficiency. The best definers of these 
factors were: P-F (Q4), nervous tension; grades 
in education courses; P-F (A), cyclothymia vs. 
schizothymia; P-F (E), dominance or ascendance 
vs. submission; P-F (I), emotional sensitivity 
vs. tough maturity; P-F (M), bohemianism vs. 
practical concernedness; and P-F (B), general 
intelligence vs. mental defect. 





Summary 


Although Erickson found three factors resident
in the rating scales, by the varimax criterion only
two factors were found in this analysis. One of
these factors resembles Erickson’s factor quite
closely, the other only somewhat. Factor I in-
volves variables that are largely ratings. Factor
II, however, best defined by the teacher’s evalua-
tion of herself, has included variables of a person-
ality nature.


Concluding Remarks 





Seven studies containing criteria of teaching 
effectiveness and other variables of the teaching 
complex were factor analyzed by the varimax meth- 
od. Certain general conclusions may be drawn. 


1. The two general criteria of teaching efficiency, pupil change and rating of teaching, tend in almost all studies to emerge as markers, or definers, of factors. The emergence of these variables as the key definers of factors was unanticipated at the beginning of the study. There is nothing in the factorial method that should artificially produce this result if the criteria weren’t logical definers of the domain.

2. Pupil gain factors bear no relationship to rating factors.

3. There are at least two kinds of pupil change occurring under the influence of the teacher. Rostker’s study indicated that these might be representative of informational and non-informational changes. However, La Duke’s data failed to substantiate this structuring, although two change factors were found. La Duke’s data suggested that the conception of pupil change as being of four kinds (appreciations, attitudes, information, and interests) is an oversimplification of the complex academic changes occurring in pupils.

4. In general, ratings of teaching efficiency were unidimensional. However, Erickson’s data indicated that self-ratings might be an additional criterion factor.

5. Of the many correlative variables in these studies, only a few, and these not consistently so, bear relationship to the pupil change factors. Among these were the teacher’s intelligence, attitudes, ability to recognize, diagnose, and correct pupils’ emotional problems, and the teacher’s knowledge of the subject matter. It should be emphasized that these variables were neither consistently nor highly related to the pupil change factors. In general, objectively measured personality characteristics were not found on pupil change factors.

6. The rating factor remains enigmatic. In general, few correlative variables were related. However, the evaluation of a teacher’s future proficiency by interview seems factorially better than evaluation by autobiographical information.


FOOTNOTES 


The writer wishes to acknowledge the assistance of Dr. Willard E. North of Central Missouri State College in the collection of these data.

H. F. Kaiser, “The Varimax Criterion for Analytic Rotation in Factor Analysis,” Psychometrika, XXIII (1958), pp. 187-200.

K. J. Holzinger and H. H. Harman, Factor Analysis: A Synthesis of Factorial Methods (Chicago: University of Chicago, 1941).

H. F. Kaiser, Comments on Communalities and the Number of Factors, paper read at an informal conference at Washington University, St. Louis, May 14, 1960.

L. Guttman, “Some Necessary Conditions for Common-factor Analysis,” Psychometrika, XIX (1954), pp. 149-161.





JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961) 


CHAPTER VII 


A NON-ADDITIVE APPROACH TO THE MEASUREMENT 
OF TEACHER EFFECTIVENESS 


LELAND E. JENSEN 


Many attempts have been made to develop val- 
id and reliable measures of teacher effectiveness, 
none wholly successful. The difficulties may re- 
side in any number of related areas of which the 
following appear to be most salient: a) prevailing 
theoretical constructs of human abilities, b) the 
elements of success, if there are such, that the 
researchers project, c) the measurement instruments which have been employed, and d) the criteria of teacher effectiveness. There is trouble, but no one is quite certain where it is.

This paper aims to explore a non-statistical, non-additive hypothesis which brings into question the statistical approach so generally employed in the study of teacher effectiveness. The hypothesis to be tested is that good teachers possess, to a greater degree, the characteristics deemed important by those making the evaluation, than do
average teachers. The good teacher may be out- 
standing on only one quality, but his range on the 
other qualities must not drop below the critical 
level. The superior teacher will have more high 
level competencies among the variables than will 
the average teacher. 

This hypothesis asserts that there are a limited few uncorrelated critical elements out of which success and failure may arise and that good teachers are persons who possess one or more talents to some marked degree. But good teachers do not possess deficiencies which are considered critical by employing officials, other teachers, pupils, or the society in which we live, and which might make them unacceptable and possibly, therefore, ineffective. Their other talents may be randomly distributed throughout the middle of the distribution. A poor teacher is any teacher who, though perhaps superior in many respects, has one or more critical deficiencies. The attention in this survey is focused upon the extremes of the distributions and the patterns of efficiency and inefficiency.

The uncorrelated qualities which will be tenta- 
tively identified for testing are as follows: 

1. Academic aptitude—scholarship and competen- 
cy in area of specialty. 

2. Professional adequacy—knowledge of pupil de- 
velopment, learning, and teaching methods. 








3. Personal acceptability—skill in human relations, teacher-pupil relations, administrative acceptance, peer evaluation, group membership and leadership.

4. Motivation and interest in teaching—socio-economic sufficiency, and family influence.

5. Physical attributes—health, psycho-motor 
ability, and energy. 
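
The contrast between the additive approach and the non-additive hypothesis stated above can be made concrete with a small sketch. Everything specific in it is an assumption made for illustration: the 1-to-5 rating scale, the two threshold values, the label "average" for profiles that are neither good nor poor, and the two invented teacher profiles. The chapter itself does not fix a critical level.

```python
# The five tentative qualities listed in the chapter.
QUALITIES = (
    "academic_aptitude",
    "professional_adequacy",
    "personal_acceptability",
    "motivation",
    "physical_attributes",
)

# Hypothetical thresholds on an assumed 1-5 scale (not taken from the study).
CRITICAL_LEVEL = 2.5
OUTSTANDING_LEVEL = 4.5


def additive_score(profile):
    """Conventional additive index: the mean of all quality ratings."""
    return sum(profile[q] for q in QUALITIES) / len(QUALITIES)


def non_additive_class(profile, critical=CRITICAL_LEVEL,
                       outstanding=OUTSTANDING_LEVEL):
    """Classify a profile by the non-additive rule.

    Any rating below the critical level makes the teacher "poor",
    whatever the other ratings are. Otherwise at least one outstanding
    rating makes the teacher "good"; remaining profiles are "average".
    """
    ratings = [profile[q] for q in QUALITIES]
    if min(ratings) < critical:
        return "poor"
    if max(ratings) >= outstanding:
        return "good"
    return "average"


# Two invented profiles, listed in the order of QUALITIES.
teacher_a = dict(zip(QUALITIES, (4.8, 3.0, 3.2, 3.1, 3.0)))
teacher_b = dict(zip(QUALITIES, (4.9, 4.8, 2.0, 4.7, 4.6)))

for name, profile in (("A", teacher_a), ("B", teacher_b)):
    print(name, round(additive_score(profile), 2), non_additive_class(profile))
```

In this invented case the additive index ranks Teacher B above Teacher A, while the non-additive rule classes B as poor because of a single critical deficiency. That reversal is the kind of pattern the hypothesis predicts.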

With this hypothesis in mind, a number of the 
studies here summarized will be re-examined 
through the use of non-correlated techniques, with 
a view to ascertaining whether good and poor 
teachers conform to the hypothesis stated. This approach may make it possible to find a limited few definable characteristics which might be used to differentiate between good and poor teachers. Levin (5, p. 35)* seems to support this view. He believes that scores for different criteria must not be summed indiscriminately; that the criteria should be narrowed; and that relationships should be sought for each criterion independently. By noting the range of scores on various measures according to a number of criteria, it should then be possible to develop a better schema of the qualities on which superior teachers appear to excel.

In order to explore this hypothesis from a non- 
statistical approach, three of the most carefully 
constructed factor analytic studies were chosen 
for which there were relatively complete data for 
all the subjects. From each study, the four best 
and the four least capable teachers were deter- 
mined on the criteria used by its author. The 
numerous measures were then re-examined to 
determine if there is support for the hypothesis. 
Although an analysis of the 48 cases selected on 
the five criteria used in the studies will not es- 
tablish the hypothesis, there were enough cases 
to disprove it. 
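
The selection procedure just described (take the four best and four least able teachers on a criterion, then see which group averages higher on each measure) can be sketched as follows. The teacher codes, criterion scores, and ratings are invented for illustration, and the function names are hypothetical; none of this is the studies' data.

```python
from statistics import mean


def extreme_groups(criterion, k=4):
    """Return the k highest and k lowest teacher codes on one criterion."""
    ranked = sorted(criterion, key=criterion.get, reverse=True)
    return ranked[:k], ranked[-k:]


def direction_of_difference(measures, best, least):
    """For each measure, name the group with the higher average and the gap."""
    summary = {}
    for name, scores in measures.items():
        gap = mean(scores[t] for t in best) - mean(scores[t] for t in least)
        favored = "best" if gap > 0 else "least" if gap < 0 else "neither"
        summary[name] = (favored, abs(gap))
    return summary


# Invented criterion scores for ten teachers.
criterion = {
    "T01": 4.6, "T02": 4.4, "T03": 4.1, "T04": 3.9, "T05": 3.5,
    "T06": 3.2, "T07": 2.9, "T08": 2.6, "T09": 2.4, "T10": 2.1,
}

# Invented ratings on two measures for the same teachers.
measures = {
    "first_impression": {
        "T01": 4.5, "T02": 4.0, "T03": 4.0, "T04": 3.5, "T05": 3.5,
        "T06": 3.0, "T07": 3.0, "T08": 2.5, "T09": 2.5, "T10": 2.0,
    },
    "sociability": {
        "T01": 3.0, "T02": 3.5, "T03": 3.0, "T04": 3.5, "T05": 3.5,
        "T06": 3.5, "T07": 4.0, "T08": 3.5, "T09": 4.0, "T10": 3.5,
    },
}

best, least = extreme_groups(criterion)
summary = direction_of_difference(measures, best, least)
for name, (favored, gap) in summary.items():
    print(f"{name}: favors {favored} group by {gap:.1f}")
```

A comparison of this kind only describes the direction of differences between two small extreme groups; it says nothing about statistical significance, which is in keeping with the exploratory aim stated above.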


* Reference numbers in this particular chapter refer to references at the end of this chapter.


Example I


The data for the first analysis are drawn from Von Haden’s study (9, p. 61). His thesis pertained to the use of subjective data and its relationship to future teaching success as measured by:

1. Composite of five supervisory ratings: Meetings were held of the evaluative staff, composed of a methods and a professional course professor from the University of Wisconsin, two supervisors from the State Department of Public Instruction, and the principals involved, to establish standards and uniformity in the use of the rating sheet.

2. Evaluation of teachers by pupils: The pupils were asked to rank the teacher under evaluation as compared to the other teachers the student had.

3. Residual pupil gain: Two tests, separated by an interval of six months, were administered to the pupils in the respective subject areas.

These evaluations were made during the latter part of the first year of teaching. The basic data for the eight teachers on the three criteria are presented in Tables I, II, and III with the code numbers for the teachers as used by Von Haden. The criterion of residual pupil gain is shown only in Table III because, of the 48 teachers in the sample, only 17 were evaluated on this measure. So few of the teachers selected on the other two criteria had been involved in the pupil gain testing that it was not practicable to limit this study to those teachers who had been evaluated on all three. However, as will be discussed later, the correlation among the three total groups on each criterion was so insignificantly positive that it does no harm to our present exploration.

It should also be noted that there are three primary sources for the data, other than the criteria evaluations: 1) two one-hour structured interviews conducted by two members of the staff of the School of Education, University of Wisconsin; 2) a detailed twenty-page data booklet; and 3) student teaching evaluations by the supervisory staff. These data were collected while the subjects were seniors at the University of Wisconsin.

Let us first examine the hypothesis regarding the teachers selected on the basis of supervisory ratings. As could be predicted from the sources of the data, there appears to be a high order of agreement between preservice ratings and supervisory evaluations in the field. Certainly constant working with teachers and teacher candidates should develop some commonalities among professionals in their concepts of a “good” teacher. However, this does not minimize the extent to which the good teachers surpassed the less able teachers on many separate ratings by professionally trained “experts.”

After noting the general direction of the differences, it becomes evident that some of the least able teachers on the criterion were rated higher than some of the best on the various qualities. This, however, has already been hypothesized. The important issue is how low a high-level teacher can fall on any of the measures before becoming ineffective. Teacher 5, who was most effective on this criterion, had the lowest score of 2.7 on Considerateness. Since none of the better teachers otherwise had a score below 3.0, two questions arise: a) Is considerateness a critical factor? And b) where is the cut-off point?

Referring to other studies summarized within 
this monograph, such as Barr (1, p. 185), Hell- 
fritzsch (3, p. 166), Lamke (4, p. 217), and 
Schick (7), there appears to be evidence that con- 
siderateness is a critical factor. For example, 
in measuring personality factors, Hellfritzsch 
found that teachers possessing liberal social be- 
liefs tend to produce greater pupil growth. Barr 
and Lamke, in separate studies, found that good 
teachers, more than poor, are more likely to 
have abundant emotional responses and empathy 
toward another’s problems. However, when com- 
paring the rating of 2.7 on considerateness with 
the ratings of the poor teachers and with the pupil 
evaluation criterion, it is not clear if it is below 
the critical level. 

Examining the data for the least able teachers, one finds a number of scores which apparently fall below a decisive cut-off point. Teachers 4 and 58 tend to substantiate the hypothesis if their numerous scores below 2.7 are on critical qualities. They are particularly low on General First Impression, Initiative, Professional Judgment, System of Values, and Work Habits. However, for Teachers 53 and 47, the case is not clear. Teacher 47 had a score of 2.3 on Energy and Initiative. Teacher 53 had only one low score of 2.7 on Professional Judgment and yet was selected as the least able on the criterion of Supervisory Ratings.
On this criterion, there appear to be a number of qualities, some related to each other, which differentiate between the two groups. General First Impression by the interviewers seems to be the strongest of these. Work Habits, Professional Judgment, Adaptability, System of Values, Energy, and Initiative are digest evaluations which appear to be consistent, with only a little support from the student’s actual report of the Hours Studied per Week. The supervisor’s rating of Practice Teaching and the subject’s self-report of Participation in Extra-Curricular Activities in high school also gave a strong indication of the group in which a particular teacher would be classified. Although the Percent of Total Expenses Earned by the subjects and their self-appraisal of Health were in the same direction as the other measures, it is difficult to establish whether any level within the range is critical.

When applying the criterion of pupil evaluation (Table II), it is apparent that the directionality of the differences between good and poor teachers is not as consistent as in the case of supervisory ratings. Also, the spread between the averages of the measures for the two groups was, in most cases, not as great. Yet the students chose teachers as superior who were rated markedly better on the first criterion. One unique element was the shift of the practice teaching rating by the instructors in the field. The least able teachers were rated superior to the best by the cooperating teachers. It is also of interest to note that the interview ratings on sociability of the least able teachers were superior to those of the best teachers, although not markedly so. Could it be that these two evaluations are not specific for effective and ineffective teachers?


TABLE I

SUBJECTIVE DATA AS COMPARED TO THE CRITERION OF SUPERVISORY RATINGS FOR FOUR BEST AND FOUR LESS ABLE TEACHERS*


TABLE II

SUBJECTIVE DATA AS COMPARED TO THE CRITERION OF PUPIL EVALUATION FOR FOUR BEST AND FOUR LESS ABLE TEACHERS*


TABLE III

SUBJECTIVE DATA AS COMPARED TO THE CRITERION OF RESIDUAL PUPIL GAIN FOR FOUR BEST AND FOUR LESS ABLE TEACHERS*


[The bodies of Tables I, II, and III are not recoverable from the source. The legible column groups are: Interview Ratings (5 = best); Composite of Three Ratings on Interview Digests (5 = best); Practice Teaching Ratings (5 = best), by instructor and by supervisor; and Data Booklet. The legible measures are: General First Impression, Adaptability, Initiative, Sociability, Considerateness, Social Adequacy, Professional Judgment, System of Values, Work Habits, Motivation and Values, Initiative and Creativity, Hours Studied Weekly, Percent of Total Expenses Earned, Participation in Extra-Curricular Activities, and Health Appraisal.]

* Ratings have been rounded to the nearest tenth.
** Arrows indicate which group of four averaged higher on particular measures.

Note: Data extracted from Von Haden, Herbert I., Evaluation of Certain Types of Personal Data Employed in the Prediction of Teacher Efficiency, Ph.D. Dissertation, University of Wisconsin, 1946.


There appear to be a number of evaluations which differentiate between the two groups of teachers to a marked extent. As under supervisory ratings, General First Impression appears to discriminate between good and poor teachers selected by pupil evaluations. Professional Judgment and Work Habits, as rated from the interview digests, are in strong agreement with the criterion. The subjective data from the student booklet again point in a consistent direction, with the Health appraisal being the only self-evaluation which differentiates more strongly than it did on the first criterion.

Teacher 38 was the only carry-over from the first criterion. However, 38 maintained the second best position, and no teacher from the previous analysis appeared in a reverse position on the continuum from best to poorest. It therefore seems that none of the most able teachers on the first criterion had a critical deficiency on the second. From the standpoint of supervisory ratings and pupil evaluations, the hypothesis has not as yet been refuted.

The most interesting finding concerning the teachers selected on the pupil gain criterion (Table III) is that those who produced greater pupil gains were not rated consistently superior to the low pupil gain achieving teachers by the pupil evaluation criterion. For example, Teacher 48 received the lowest rating of all eight teachers on the pupil evaluation criterion and yet scored third highest in residual pupil gain of the 17 teachers employed in this phase of the study. Furthermore, the difference between the two groups on the supervisory criterion ratings was negligible. This seems to support the belief that teachers who are held in high esteem by the students do not necessarily produce greater gains in the academic development of their pupils.

A closer look at the individual ratings of these teachers on the supervisory ratings criterion indicates, with the exception of Teacher 4, that composite ratings gained through observation of teachers in their regular classrooms will not reliably separate those teachers who produce superior pupil gains. The inconclusiveness of a comparison among the three criteria measures employed is even more striking when the correlations among them are studied.

The highest correlation coefficient is .281 be- 
tween pupil evaluation and supervisory ratings. 
This can hardly be viewed as a significant cor- 
relation. It would appear that different aspects 
of a teacher’s effectiveness are measured by the 
three criteria. Therefore, qualities which are 
consistent factors in distinguishing between good 
and poor teachers on all three criteria should 
represent these different components of teacher 
efficiency. 
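
The claim that a coefficient of .281 is hardly significant can be checked with the usual t test for a correlation coefficient. This is a swapped-in modern convention, not necessarily the test used in the original study. The sample size of 48 is an assumption taken from the total number of teachers in the sample; the number actually entering this particular correlation is not stated here.

```python
import math


def t_for_correlation(r, n):
    """t statistic for testing a Pearson r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r = 0.281  # highest intercorrelation reported in the chapter
n = 48     # assumed: the full sample of 48 teachers

t = t_for_correlation(r, n)
print(f"t = {t:.3f} with {n - 2} degrees of freedom")
```

Under these assumptions t is a little under 2.0, just short of the two-tailed 5 percent critical value of roughly 2.01 for 46 degrees of freedom. With fewer teachers, as for the 17 tested on pupil gain, a coefficient of this size would fall still further from significance.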

Let us now examine the differences on the subjective measures. There are four averages which favor the less able teachers, namely the Supervisor’s Practice Teaching Grade, evaluation of Social Adequacy, Participation in Extra-Curricular Activities, and appraisal of Health. On none of these is the less able group more than two-tenths of a point better than the more able teachers. On the measures where the best teachers in promoting pupil gain were superior, only General First Impression, Instructor’s Practice Teaching Grade, Professional Judgment, and Work Habits were more than three-tenths of a point better than the least able.

The differences between the best and the least able teachers appear to be minimized because of the few teachers evaluated on this criterion. And yet, the items that most strongly differentiate between the two groups seem to be consistent with the results of the application of the first two criteria. None of the superior teachers had an evaluation on the interview and digest material of lower than 2.5. However, of the four less able teachers on this criterion, only Teacher 4 was at or below this cut-off point on more than one rating.


From the student data booklet, only the Per- 
cent of Total Expenses Earned while in college 
maintained the same directionality of the first two 
criteria. The less able teachers reported slight- 
ly greater Participation in Extra-Curricular Ac- 
tivities, higher appraisal of Health, and signifi- 
cantly more Hours Studied per Week. 


Conclusion 


Although the evidence from the application of the criterion of residual pupil gain was somewhat ambiguous, the measures which most strongly differentiated between the best and the least able teachers were almost identical to the first two findings. Teachers 4 and 38 were included in two of the tables and, in both cases, remained in either the most able or least able group. The evidence from the original study indicates that, if a larger group had been included to represent the extremes, more teachers would have been consistently selected.


TABLE IV

ZERO ORDER COEFFICIENTS OF CORRELATION AND THEIR ERRORS FOR THE INTERCORRELATIONS AMONG THREE CRITERIA (9:12)

[Table body not recoverable from the source. The table cross-tabulated Supervisory Rating, Pupil Evaluation, and Residual Pupil Gain, giving each coefficient with its error.]


Von Haden’s use of subjective data, as it relates to three criteria of teacher effectiveness, seems to indicate that a number of judgment ratings do differentiate between superior and average teachers. The more able teachers, by his definition, were awarded higher ratings on most of the measures and they had decidedly fewer low ratings.

The factors which did discriminate can be as- 
signed under the following qualitative headings: 


Professional Adequacy

Professional Judgment
Supervisor’s Practice Teaching Rating
Instructor’s Practice Teaching Rating


Personal Acceptability

General First Impression
Initiative


Physical Attributes

Energy Rating
Health Self-Appraisal


Motivation

Percent of Total Expenses Earned
Participation in Extra-Curricular Activities
Work Habits


When administrative decisions have to be made, 
it appears that judgments relative to these qual- 
ities will be helpful in selecting the more able 
candidates from a group of applicants. 


Example II


The data for the second analysis are extracted from Lins’ thesis (6, p. 2). Both Von Haden and Lins were part of a teacher evaluation task force which intensely studied many aspects of teacher candidates as they progressed toward the teaching profession. The population and criteria employed by Lins were identical to those of Von Haden’s study. However, Lins included such objective data as high school record, various grade point averages, and the results of achievement tests. Ratings were also made by the same professional staff of the University of Wisconsin concerning the teacher candidate’s worth as a Director of Learning, Student Counselor, Member of the Community, and Member of the School Staff.

Since the sample and criteria are identical to Example I, the eight teachers in each of Tables V, VI, and VII are also the same as presented in Tables I, II, and III. Therefore, the various scores on the three criteria have been omitted.

Examine first the directionality of the differences between the two groups in Tables V, VI, and VII. It is somewhat surprising to find that the pattern is almost identical to that found in Example I. Even though the measures are not the same, the teachers selected as most able by the three criteria generally surpassed the less able teachers. The largest spread between the two groups is again found on the supervisory rating criterion.

As is evident from a careful study of the results on the college norms of the Henmon-Nelson and American Council tests, the findings on teacher intelligence, as purportedly measured by achievement tests, are quite contradictory. The supervisory team selected those teachers as best who were five to seven points above those selected as less able. Conversely, the teachers who were selected as best by the pupils were six to eight points inferior to those chosen as the poorest. On the criterion of residual pupil gain, with a much smaller population, the difference between the two groups was negligible. Therefore, it appears that there is little evidence to support the use of an intelligence or achievement measure in the selection of teachers if they meet the academic requirements and succeed in making normal progress toward graduation.

Moving to the measures involving academic aptitude, such as the candidate’s grade point in the total college career, major and minor fields, and professional education, the differences are more clear-cut. The more able teachers, as selected on all three criteria, surpassed the less able. Particularly on Total Grade Point, most of the superior teachers completed their college careers with an academic average in excess of C plus. The only exception was Teacher 28 on the criterion of pupil evaluation. None of the teachers selected on the criterion of supervisory ratings as most able had a total grade point of much below the B level.

The remaining two objective measures, Practice Teaching Grade and Percentile Rank in High School Graduation Class, tended to differentiate between the two groups in the same manner as the other professional indices. The most striking finding is that the supervisory criterion identified those as superior who had received A grades in Practice Teaching. On the other two criteria as well, more teachers in the superior group than in the less able group received an A rating. None of the superior teachers, on any of the three criteria, was awarded a grade below a B in Practice Teaching. Therefore, it appears that a grade of C, as awarded by the University of Wisconsin, is below the cut-off point for effective teachers.

The results of the two one-hour interviews, as shown by the ratings of the candidates as Directors of Learning, Student Counselors, and Members of the School Staff and Community, also clearly indicated toward which group a given teacher would be rated. A rating below two seemed invariably to cast a teacher into the less able category. This gives further evidence for the hypothesis that a good teacher must have the social qualities deemed necessary for success in an interactional setting.


TABLE V

PRESERVICE DATA AS COMPARED TO THE CRITERION OF SUPERVISORY RATINGS FOR FOUR BEST AND FOUR LESS ABLE TEACHERS*


TABLE VI

PRESERVICE DATA AS COMPARED TO THE CRITERION OF PUPIL EVALUATION FOR FOUR BEST AND FOUR LESS ABLE TEACHERS*


TABLE VII

PRESERVICE DATA AS COMPARED TO THE CRITERION OF RESIDUAL PUPIL GAIN FOR FOUR BEST AND FOUR LESS ABLE TEACHERS*


[The bodies of Tables V, VI, and VII are not recoverable from the source. Each table listed the following measures for the four best and four least able teachers by code number: Henmon-Nelson College; American Council Test; 7 Semester Grade Point; Major Field Grade Point; Professional Ed. Grade Point; Minor Area Grade Point; Practice Teaching Grade; High School Rank; Director of Learning; Student Counselor; Member of School Staff; Member of Community. Legible code numbers: Table V, best 5, 38, 9 and least 53, 4; Table VI, best 21, 38 and least 29, 47; Table VII, best 8, 55 and least 57, 26, 4.]

* Ratings have been rounded to the nearest tenth.
** Arrows indicate which group of four averaged higher on particular measures.

Note: Data extracted from Lins, Leo Joseph, The Prediction of Teaching Efficiency, Ph.D. Dissertation, University of Wisconsin, 1946.



Conclusion 


The application of the results of intelligence or achievement tests to the selection of teacher candidates does not seem warranted from these findings. This assumes that the teacher candidates have met the requirements of the teacher training institution and are certified. The important factors appear to be the level of success in academic and professional course work.

Candidates who possess average college-level abilities and are superior in course work seem to be more effective teachers than do mentally superior people who did less well academically. This is not to say that highly capable students, who were academically very successful, were not selected as being able teachers by these three criteria. Superior college academic achievement, whether due to intellectual or motivational factors, appears to be the best indicator from preservice data.

The practice teaching grade, perennially a controversial index of later success, did serve in this sample to differentiate between the best and least able teachers. However, the two groups differed markedly only on the supervisory criterion.

The subjective ratings from the interviews with the candidates, prior to their accepting positions, followed a pattern similar to the practice teaching grade. The prospective teachers did exhibit characteristics which the professional interviewers were able to recognize as more likely to produce success. Success, in this sense, is defined as being selected by the supervisors’ and pupils’ evaluation as an excellent teacher.

Lamke’s findings are pertinent to this discussion (4, p. 217). In a study involving teachers’ personality traits, he concludes that a factor analysis of the data “indicates that several patterns exist for good and poor teachers. Good teachers are more likely to be gregarious, adventurous, frivolous, to have abundant emotional responses, strong artistic or sentimental interests, to be interested in the opposite sex, to be polished, and fastidious. Poor teachers are more likely than good teachers to be shy, cautious, conscientious, to lack emotional responses and artistic or sentimental interests, to have a comparatively slight interest in the opposite sex, to be clumsy, easily pleased and more attentive to people than good teachers.”





Adapting the results of Lins’ data, as presented from the view of the extremes, to the hypothesized qualitative headings, the following factors should be considered in teacher selection:

Academic Aptitude
    Total Grade Point
    Major Field Grade Point
    High School Rank

Professional Adequacy
    Practice Teaching Grade Point
    Professional Education Grade Point

Personal Acceptability
    Interview Ratings


Example III


The third pattern analysis is extracted from the thesis of Schwartz (8, p. 63). His investigation concerned the measurement of primary source traits, as defined and developed by Cattell (2), to determine their value as a predictor of teacher effectiveness. The subjects were a group of thirty-four seniors, consisting of twenty women and fourteen men, enrolled in teacher education at the University of Wisconsin.

The data-gathering devices were nineteen ob- 
jective tests of primary source traits, measuring 
ten factors of personality, developed by Dr. R. 
B. Cattell and Schwartz. From this battery, the 
results on nine tests which purported to measure 
six of the ten factors used in the investigation 
will be examined by the same method employed 
in the first two examples. 

Nine tests were selected because the statistical procedures employed by Schwartz indicated that they would be the most helpful of the nineteen in identifying the best teachers on his criteria. From the implication of the usefulness of subjective evaluations of personality factors in identifying the most able teachers in the previous examples, it is hoped that objective measures of personality traits would further strengthen a non-additive pattern approach to the prediction of teacher efficiency.

The tests of the six factors of personality, using the numbering system of Schwartz, are listed by factor following Table X.

Three of the five criteria employed by Schwartz, for which sufficient data were available, will be used to select the teachers for this analysis. They are Student Teaching Grade, Professional Grade Point, and Total Grade Point. Tables VIII, IX, and X present the findings of these criteria as applied to the teacher personality measures.

The consistency of the direction of the first 
four items on all three tables is the first apparent 







TABLE VIII 


PERSONALITY DATA AS COMPARED TO THE CRITERION OF STUDENT TEACHING 
GRADE FOR FOUR BEST AND FOUR LESS ABLE TEACHERS* 








Column headings: Criteria (Professional Grade Point Average; Total Grade Point Average); Total Primary Traits (Top and Bottom 1/3); Individual Tests (1 a Humor Judgment; 3 a Logical Assumptions; 3 b Suggestibility to Authority; Consonant/Dissonant; 5 c Fluctuation of Attitudes; 8 a Willingness; 9 a Reaction Time; 10 b Perseveration).

Row groups: Four Best (Rating, No.); Four Least (Rating, No.); Least Ave.

[Cell values are illegible in this copy. Legible teacher numbers in the Four Least group: 14, 18, and 26.]














* Ratings have been rounded to the nearest tenth or whole number. 
** Arrows indicate which group of four averaged higher on particular measures. 
Note: Data extracted from Schwartz, Anthony N., A Study of the Discriminating Efficiency of Certain
Objective Tests of the Primary Source Personality Traits of Teachers, Ph.D. Dissertation,
University of Wisconsin, 1950.













TABLE IX 


PERSONALITY DATA AS COMPARED TO THE CRITERION OF TOTAL GRADE POINT 
AVERAGE FOR FOUR BEST AND FOUR LESS ABLE TEACHERS* 





Column headings: Criteria (Student Teaching; Professional Grade Point Average); Total Primary Traits (Top and Bottom 1/3); Individual Tests (1 a Humor Judgment; 1 b Tempo; 3 a Logical Assumptions; Suggestibility to Authority; Consonant/Dissonant; 5 c Fluctuation of Attitudes; 8 a Willingness to Modify; 9 a Reaction Time; 10 b Perseveration).

Row groups: Four Best (Rating, No.); Four Least (Rating, No.).

[Most cell values are illegible in this copy. Legible entries: Four Best, 2.9 (No. 24), 2.8, 2.6; Four Least, No. 14, 1.3 (No. 18), 1.3 (No. 27), 1.4 (No. 5).]














* Ratings have been rounded to the nearest tenth or whole number. 
** Arrows indicate which group of four averaged higher on particular measures. 
Note: Data extracted from Schwartz, Anthony N., A Study of the Discriminating Efficiency of Certain 
Objective Tests of the Primary Source Personality Traits of Teachers, Ph.D. Dissertation, 
University of Wisconsin, 1950. 













TABLE X 


PERSONALITY DATA AS COMPARED TO THE CRITERION OF PROFESSIONAL GRADE POINT 
AVERAGE FOR FOUR BEST AND FOUR LESS ABLE TEACHERS* 





Column headings: Criteria (Student Teaching; Total Grade Point Average); Total Primary Traits (Top and Bottom 1/3); Individual Tests (1 a Humor Judgment; 3 a Logical Assumptions; 3 b Suggestibility to Authority; 5 a Consonant/Dissonant; 5 c Fluctuation of Attitudes; 8 a Willingness to Modify; 9 a Reaction Time; 10 b Perseveration).

Row groups: Four Best (Rating, No.); Four Least (Rating, No.).

[Most cell values are illegible in this copy. Legible entries: Four Best, 3.0 (No. 23), 3.0; Four Least, 1.0 (No. 14), 1.8 (No. 16), No. 19.]








* Ratings have been rounded to the nearest tenth or whole number.
** Arrows indicate which group of four averaged higher on particular measures. 
Note: Data extracted from Schwartz, Anthony N., A Study of the Discriminating Efficiency of Certain 
Objective Tests of the Primary Source Personality Traits of Teachers, Ph.D. Dissertation, 
University of Wisconsin, 1950. 













Factor I
Cyclothymia vs. Schizothymia
    1 a Judgment of Humor
    1 b Tempo (Natural Speed)

Factor III
Emotional Maturity vs. Demoralized Emotionality
    3 a Inability to Select Logical Assumptions
    3 b Suggestibility to Authority

Factor V
Surgency vs. Agitated Melancholic Desurgency
    5 a Ratio: Consonant/Dissonant Statements Recalled
    5 c Fluctuation of Attitudes

Factor VIII
Positive Integration vs. Immature Character
    8 a Willingness to Modify

Factor IX
Adventurous Cyclothymia vs. Withdrawn Schizothymia
    9 a Reaction Time

Factor X
Neurasthenia vs. Vigorous Character
    10 b Perseveration





finding. Columns 1 and 2 indicate the extent to which the criteria agree with one another. It appears that teachers selected on one criterion will tend to be chosen by the other two. However, although no teacher was selected in the best group with a Student Teaching Grade below four or B minus, Teacher 5 in the low group on Total Grade Point was awarded an A in Student Teaching and earned a B plus in Professional Grade Point. Therefore, it appears that no one of these measures, used alone as a criterion, can be supported, but rather that an assessment scheme must include all three.

The next two columns represent the number of 
times a particular teacher was in the top one- 
third or bottom one-third on all nineteen objec- 
tive tests of surface traits used in the original 
study. Schwartz (8, p. 76) concluded that ‘‘it 
would seem that those who have more pronounced 
traits did a better job in student teaching than 
those who were rather ‘colorless’ individuals’’. It 
appears that this conclusion holds true for the 
other two criteria when analyzed by four exam- 
ples at the extremes. 

On test 1 a-Humor Judgment, there does not appear to be a consistent pattern. This finding seems to be true for the other measures used, with the exception of tests 1 b-Tempo (Natural Speed), 3 b-Suggestibility to Authority, and 9 a-Reaction Time. These three tests maintained their directionality on each criterion.
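
The directional comparison described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure Schwartz or the present analysis actually used: the function names are hypothetical, and the scores are invented solely to show how one might check whether the four best teachers average higher or lower than the four least able on each measure, and whether that direction holds on every criterion.

```python
from statistics import mean


def direction(best, least):
    """Return +1 if the best group averaged higher, -1 if lower, 0 if tied."""
    diff = mean(best) - mean(least)
    return (diff > 0) - (diff < 0)


def consistent_measures(tables):
    """Find measures whose direction is the same on every criterion.

    tables maps criterion -> {measure: (best_scores, least_scores)}.
    Only measures present under every criterion are considered, and a
    tie on any criterion disqualifies the measure.
    """
    shared = set.intersection(*(set(t) for t in tables.values()))
    result = {}
    for measure in sorted(shared):
        dirs = {direction(*t[measure]) for t in tables.values()}
        if len(dirs) == 1 and 0 not in dirs:
            result[measure] = dirs.pop()
    return result


# Hypothetical scores, invented for illustration (not Schwartz's data).
tables = {
    "student_teaching": {
        "reaction_time": ([0.20, 0.22, 0.21, 0.19], [0.28, 0.30, 0.27, 0.29]),
        "humor_judgment": ([12, 15, 11, 14], [13, 10, 12, 11]),
    },
    "total_grade_point": {
        "reaction_time": ([0.21, 0.20, 0.23, 0.22], [0.27, 0.29, 0.31, 0.26]),
        "humor_judgment": ([10, 11, 12, 9], [13, 12, 14, 11]),
    },
}

print(consistent_measures(tables))  # prints {'reaction_time': -1}
```

In this invented example the reaction-time measure keeps the same direction (the best group is lower, that is, quicker) on both criteria, while the humor measure reverses direction and is therefore dropped, which mirrors the kind of judgment made from the arrows in Tables VIII, IX, and X.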

Test 1 b-Tempo was a measure of the candidate’s normal work pattern as indicated by the time used to complete particular tasks. The tasks involved psycho-motor ability in numerous selection, sorting, and dexterity-demanding activities. The candidates were not informed of the timed factor and were instructed to proceed at a normal pace. The teacher candidates chosen as most able on these three criteria used less time in completing the tasks.

Test 3 b-Suggestibility to Authority measured the shift of a candidate’s reaction to fifty controversial statements. Three weeks after the candidates recorded their reactions to the statements, the statements were presented again. On this second testing, the source of each statement was given. The index was the amount of change by the candidates toward the viewpoint of the authority. Although the average amount of change on all criteria indicated that the best candidates shifted their opinion to a greater extent, the ranges of the two groups were not discrete.

Test 9 a-Reaction Time was a measure of the 
elapsed time between the presenting of a visual 
stimulus and the candidate’s reaction of depres- 
sing a key. The superior candidates appear to be 
consistently quicker on this measure. 


Conclusion 


The adaptations by Schwartz of Cattell’s tests 
of primary source traits do not generally appear 
to be helpful in the selection of candidates on 
these particular criteria. Of the three measures 
which were consistent in maintaining their direc- 
tion, only Tempo and Reaction Time are discrete 
enough to warrant consideration. 

Although using the total score on the Humor Judgment test did not seem to be of value in the selection of teachers by this analysis, Schwartz found a positive correlation between clusters of the items and the criteria. He concludes that “a sense of humor does have a significant relationship to success in the teaching profession.” (8, p. 102)

Two qualitative factors in the hypothesized assessment of teacher effectiveness appear to receive some support from three items in this analysis. They are:


Physical Attributes
    Tempo (Natural Speed)
    Reaction Time

Personal Acceptability
    Humor


It can be added that the results of the total battery of tests indicated that a modal or “colorless” personality was seldom selected as a promising prospect for the teaching profession.


Application of Findings 





This analysis was initiated with a non-additive hypothesis concerning the assessment of teacher effectiveness. Many correlational approaches to this problem have combined the various components which appeared to be significant to superior teaching, with negligible results. We have hypothesized that there are a limited few non-overlapping critical qualities and that each quality is essential. Therefore, a good teacher possesses at least one superior characteristic and does not fall below average on any quality deemed necessary. A teacher may be above average on one or any number of characteristics, but if one component is missing or below average, the teacher will be ineffective.
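
The non-additive rule stated above can be made concrete with a short Python sketch. It is offered only as an illustration under stated assumptions: the 0 to 4 scale (Inferior to Superior) is taken from the pattern chart in this chapter, while the component keys, the function names, and the example profiles are hypothetical and do not come from the original investigations.

```python
# Assumed scale, following the pattern chart: 0 Inferior, 1 Below Average,
# 2 Average, 3 Above Average, 4 Superior.
SUPERIOR = 4
AVERAGE = 2

# Hypothetical keys for the five qualitative components named in the text.
COMPONENTS = (
    "academic_aptitude",
    "professional_adequacy",
    "personal_acceptability",
    "motivation",
    "physical_attributes",
)


def predicted_effective(profile):
    """Non-additive rule: every component must be present and at least
    average, and at least one component must be superior."""
    ratings = [profile.get(name) for name in COMPONENTS]
    if any(r is None for r in ratings):
        return False  # a missing component is treated as disqualifying
    return max(ratings) >= SUPERIOR and min(ratings) >= AVERAGE


def additive_score(profile):
    """Simple mean of the ratings, shown only for contrast."""
    return sum(profile[name] for name in COMPONENTS) / len(COMPONENTS)


# Invented profiles for illustration only.
balanced = dict(zip(COMPONENTS, (4, 3, 3, 2, 3)))
lopsided = dict(zip(COMPONENTS, (4, 4, 4, 4, 1)))

print(predicted_effective(balanced), additive_score(balanced))  # True 3.0
print(predicted_effective(lopsided), additive_score(lopsided))  # False 3.4
```

The second invented profile has the higher additive score yet is rejected, because one component falls below average. That is the distinction the hypothesis draws between a pattern approach and a summed or correlational one.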

Five qualitative components were presented for testing by a directional pattern method. Measures which appear to be in agreement with all the various criteria were assigned under one of these components. Although the items under each quality did not appear to be significantly correlated with items under other components, they tended to be strongly related to each other. Seventeen tests and ratings were found to meet the requirements and are as follows:


Academic Aptitude
    Total Grade Point
    Major Field Grade Point
    High School Rank

Motivation
    Percent of Total Expenses
    Work Habits
    Participation in Extra Curricular Activities

Professional Adequacy
    Professional Judgment
    Supervisor’s Practice Teaching Rating
    Instructor’s Practice Teaching Rating
    Professional Education Grade Point

Physical Attributes
    Energy Rating
    Health Self-Appraisal
    Tempo
    Reaction Time

Personal Acceptability
    General First Impression
    Initiative
    Humor


If each of a group of teacher candidates were evaluated on the items in this presentation which were available to the rater, a pattern chart of predicted success could then be constructed as hypothesized. For example, Teacher 38 (+) and Teacher 4 (-) from the first two sets of tables would be represented according to the pattern chart reproduced below. Teacher 38 would be evaluated as possessing strong possibilities of becoming a very effective teacher, whereas Teacher 4 would be viewed as a possible failure.

It is recognized that neither all the necessary 
qualities nor measures may be included in this 
analysis due to the limitations of the original 
measurement instruments and the comparatively 
small samples employed. Yet, the results of the 
original investigations, when evaluated in this 
manner, do tend to indicate that a non-additive 
approach is helpful in the prediction of teacher 
effectiveness. 

The hypothesis can only be further substantiated or refuted with the use of a much larger sample. It would be necessary to involve many objective and subjective measures prior to and during actual professional teaching to construct, with some assurance, a model of discrete constellations or components of teacher effectiveness. It appears to be possible to develop a limited few qualities which represent all the necessary elements of effectiveness. The analysis of the patterns developed in these examples gives hope that a pursuit of this nature will be more rewarding than many previous attempts.


Pattern chart scale: 0 Inferior; 1 Below Average; 2 Average; 3 Above Average; 4 Superior

Components charted: Academic Aptitude; Professional Adequacy; Personal Acceptability; Motivation; Physical Attributes

[The plotted positions for Teacher 38 (+) and Teacher 4 (-) are not legible in this copy.]


REFERENCES


1. Barr, A. S. “Recruitment for Teacher Training and Prediction of Teaching Success,” Review of Educational Research, 10: 185-190, June, 1940.

2. Cattell, R. B. Description and Measurement of Personality, World Book Co., 1946.

3. Hellfritzsch, A. C. “A Factor Analysis of Teaching Abilities,” Journal of Experimental Education, 14: 166-69, September, 1945.

4. Lamke, Tom A. “Personality and Teaching Success,” Journal of Experimental Education, XX: December, 1951, pp. 217-260.

5. Levin, H. “A New Perspective on Teacher Competence Research,” Harvard Educational Review, 24: 35-42, Spring, 1954.

6. Lins, Leo Joseph, “The Prediction of Teaching Efficiency,” Journal of Experimental Education, XV: September, 1946, pp. 2-60.

7. Schick, George J. The Predictive Value of a Teacher Judgment Test, Unpublished Ph.D. Thesis, University of Wisconsin, 1957.

8. Schwartz, Anthony N. “A Study of the Discriminatory Efficiency of Certain Tests of the Primary Source Personality Traits of Teachers,” Journal of Experimental Education, XIX: September, 1950, pp. 63-93.

9. Von Haden, H. I. “Evaluation of Certain Types of Personal Data Employed in the Prediction of Teaching Efficiency,” Journal of Experimental Education, 15: 61-84, September, 1946.














JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961)


CHAPTER VIII 


THE ABILITIES AND PATTERNS OF BEHAVIORS OF GOOD AND POOR TEACHERS 


ARCHIE L. PERONTO 


As has been previously pointed out, evaluation of teacher effectiveness may be made for many purposes such as, for example, certification, placement, employment, assignment, promotion, improvement in service, and salary determination. In some instances fine distinctions
need to be made relative to effectiveness and its 
concomitants, and in other instances not. In hir- 
ing, firing, and salary commitments, only broad 
categories of effectiveness would seem to be nec- 
essary. It is from this point of view that we shall 
approach the problem in this chapter. The dis- 
cussion to follow is based upon the assumption that 
if one were able to set up, with reasonable accu- 
racy, three broad categories of effectiveness such 
as superior, average, and inferior, certain ad- 
ministrative needs of schools might be adequate- 
ly met. 

The lack of agreement on what should be considered in teacher effectiveness, coupled with uncertainty about the criterion of effectiveness, constitutes what amounts to a roadblock in teacher evaluation. Confusion is apparent not only in the employment of various criteria but also in the array of labels which have been applied to teacher characteristics thought to be associated with effectiveness. Possibly more attention to certain broad categories of effectiveness and less to minutiae might clarify the situation. At least this is the approach to be applied in this summary.

The good teacher is defined here as the teacher who is rated as good by supervisors, administrators, or teacher educators, or whose students appear to show substantial gains in the products of education as measured by the paper and pencil tests employed. The poor teacher is one who on the basis of either of these criteria is ranked in some bottom fraction of the group studied.

In striving to discover what it is that determines whether a teacher will succeed or fail, researchers have developed and tested many hypotheses. Some have studied in depth a specific quality believed essential to teaching effectiveness. Others have employed various batteries of measuring devices in the hope that profiles of the good teacher and the poor teacher might emerge from the data.

In this study it is hoped that by superimposing a number of studies one upon another or comparing them it may be possible: 1) to discover the patterns or clusters of characteristics that appear to be associated with effective teaching, and 2) to discover variations in the amounts of these characteristics that appear to be associated with success and failure. To this end a re-study will be made of the data from certain of the studies to ascertain the extent to which good and poor teachers may be differentiated.

Using a composite rating derived from practice teaching grades, placement bureau rating, and the principal’s M-Blank rating, Margaret Jones (41) divided a group of teachers into good and poor teachers. Her data, shown in Figure I, seem to indicate that some characteristics are common to good and poor teachers alike, while other characteristics appear to differentiate good and poor teachers. It appears, for example, that the degree of emotional stability does not differentiate good and poor teachers. Good teachers, however, seem to be characterized by a preference for quickness of action and efficiency of production. They seem to be more flexible in numerical abilities and in disposition. Significant differences in academic ability are indicated. The good teachers in Jones’ study are superior to the poor teachers in intelligence, knowledge of subject matter, and professional knowledge. As measured in Jones’ study the good teacher is somewhat more sociable and dominant than the poor teacher. Correlation coefficients of the six measures found significant at the 1% level are given in Table I.

Throughout these investigations personality has been given much attention. In an effort to avoid the weaknesses inherent in subjective rating of teacher personality qualities, such as the well-known halo effect, low reliability, semantic difficulties, and profusion of terms, many researchers have devised paper and pencil tests to measure personality traits. Cattell, for example, has developed the 16 P F personality test. Certain qualities of good and poor teachers were compared on the basis of this scale by Lamke (46). While Lamke concluded that Cattell’s test, as a whole, fails to identify good and poor teachers, he did, however, find important differences between good and poor teachers in such source traits as surgency versus desurgency, adventurous cyclothymia versus inherent withdrawn schizothymia,

















FIGURE I

PROFILE OF GROUP MEANS FOR TWENTY VARIABLES

[Figure not reproduced. Legible panel labels: Preservice Achievement; Personality Traits: Guilford-Zimmerman Temperament Survey; Battery of Alternation Tests (Flexibility); Disposition Rigidity; Perceptual Speed.]







TABLE I 


COEFFICIENTS OF CORRELATION OF SIX MEAS- 
URES OF TEACHER CHARACTERISTICS TO A 
COMPOSITE RATING CRITERION 





Measures (rows):
    Professional GPA
    Major Teaching Field GPA
    Two-digit Numbers, Addition
    Two-digit Numbers, Subtraction
    Two-digit Numbers, Mixed (Flexibility)
    General Activity Personality Trait

[Coefficient values are not legible in this copy.]





TABLE II 


CRITERIA COMPARISON OF THE QUALITIES FOUND DISCRIMINATING 
BETWEEN GOOD AND POOR TEACHERS 





                                     Rating                  Pupil Change
                                     No. of Discriminating   Level of
Test                                 Scales                  Significance

Hartman Social Attitudes             5 of 5                  10%
Wrightstone Civic Beliefs            4 of 5                  10%
Bernreuter Bn                        1 of 5                  5%
Bernreuter Bd                        1 of 5                  20%
Hartman Social Information           2 of 5
Yeager Attitude Toward Teaching
Sims Socio-economic Status
Stanford Ed. Aptitude T-A
Stanford Ed. Aptitude T-R
Washburn Soc. Adjustment Inventory
Morris Trait Index L

[Blank cells are not legible in this copy.]










and sophistication versus rough simplicity. Dif- 
ferences in response patterns concerned with these 
traits appear to indicate that various combinations 
of personality traits are found in good teachers 
and in poor teachers. No single ideal type of 
teacher was identified. 

Lamke felt that certain limited generalizations might be possible on the basis of differences in certain portions of the response patterns associated with the traits identified above, provided that the size of the sample and the unknown validity of the measuring instrument were considered. Differences in response patterns on one portion of the scale suggest that the good teachers in this study are unusually alert physically, mentally, socially, and emotionally, as indicated by above average tendencies to be adventurous, gregarious, and frivolous, to be more interested in the opposite sex, and to have above average emotional responses and strong artistic or sentimental interests. The response patterns on another portion of the test are interpreted as indicating that good teachers appear to be more talkative, cheerful, placid, frank and quick than poor teachers. Another portion of the response patterns seems to reveal that good teachers appear to be about average in tendencies toward being polished, cool, and fastidious. Response patterns of poor teachers are interpreted to indicate tendencies toward being clumsy, easily pleased, and more attentive to people than good teachers. These appear to be qualities which might be associated with overconformity. The evidence in this study also appeared to indicate that good teachers tend to be more unconventional, dominant, talkative, and subject to more nervous tension than poor teachers.

The personality traits characterizing Lamke’s top teachers appear to be qualities which might be associated with good physical and mental health and a liking for people. It is easy to understand that teachers with these qualities might tend to be rated as good. An important unanswered question is whether these teacher traits actually contribute to the personal growth and development of pupils.

In an effort to secure a more adequate criterion for the differentiation of good and poor teachers, Carlson (15) employed eight pupil tests designed to measure non-informational as well as informational growth in the social studies in seventh and eighth grade pupils in one and two room rural schools. As supplementary criteria, Carlson employed paper and pencil tests of teacher qualities and a composite rating derived from three rating scales. The pupil gain and the rating criteria were finally combined in the selection of good and poor teachers, who were compared on personality qualities measured by thirteen paper and pencil tests.

When these criteria were employed, no personality quality as measured by the thirteen paper and pencil tests was found to differentiate good and poor teachers at or above the 20% level of significance except the Hartman Social Attitudes Test.

Data relative to personality qualities found to discriminate by the rating criterion and the qualities found to discriminate by the pupil change criterion are shown in Table II. Qualities significant from the 5% to the 20% level are included in the pupil change column to the right. Qualities found discriminating in one or more rating scales, and the number of scales in which they discriminate, are included in the rating column to the left. (See Table II.)

If the level of significance and the ratio of the number of discriminating ratings to the total number of ratings are accepted, it is noted that of the eleven teacher measures included in Table II there is some agreement between the two criteria on five qualities and a lack of agreement on six qualities.

When the pupil change criterion and a composite rating composed of the Michigan, Torgerson, and Almy-Sorenson teacher rating scales are combined, good and poor teachers in this study are differentiated by dominance, knowledge of mental hygiene, civic beliefs, social adjustment, socio-economic status, attitude toward teaching, social attitudes and information, and teaching ability contrasted with research ability.

The evidence from Carlson’s study seems to support conclusions reached by other investigators that different criteria and different versions of the same criterion measure different aspects of teaching effectiveness. Thus it seems likely that in the present stage of teacher measurement the use of several criteria, each measuring different aspects of effectiveness, as in Carlson’s study, will furnish different answers as to who are the good and who are the poor teachers.

In measuring the qualities believed to differentiate good and poor teachers, Carlson, Jones, and Lamke all employed paper and pencil tests. A different approach is employed in Barr’s (4) study of good and poor teachers. While a rating of teacher qualities was made in this study, the principal data gathering device employed was performance records of the classroom behaviors of good and poor teachers of the social studies. The selection of the good and poor teachers studied was made on the basis of supervisory ratings and the assumption that in the larger school systems employing officials are more critical in their selection of teachers and that the lower salaries of smaller systems attract poorer teachers.

Barr found that good teachers as compared 
with poor teachers were more vigorous, more en- 
thusiastic, and happier, less attractive, more 
emotionally stable, more pleasant, sympathetic, 
and democratic, possessed a better speaking voice, 
and displayed a keener sense of humor. 

The data derived from observed and recorded 
teacher classroom behaviors show distinct group 







TABLE III


ELEMENTS OF STRENGTH FOUND IN THE PERFORMANCE
OF 47 GOOD AND 47 POOR TEACHERS


Teacher Behavior or Quality Inferred
from Behavior                              No. of Teachers

Interest in Pupil Response                 38
Use of Illustrative Materials              36
Knowledge of Subject Matter                35
Well-developed Assignments                 32
Good Notebooks and Outside Reading         31
Conversational Manner                      25
Wealth of Commentary Remarks               22
Frequent Use of Pupil’s Experience         24
Good Technique of Asking Questions         21
Ability to Stimulate Interest              20
Socialization of Class Work                16
Supervised Study                           11
Willingness to Experiment                  10

[The original table has separate Poor and Good columns; only one value per row is legible in this copy.]










TABLE IV 


ELEMENTS OF WEAKNESS FOUND IN THE PERFORMANCE 
OF 47 GOOD AND 47 POOR TEACHERS 








Teacher Behavior or Quality Inferred
from Behavior                              No. of Teachers (Poor, Good)

No Provision for Individual Differences
No Socialization
Textbook Teaching
Inability to Stimulate Interest
Weak Discipline
Lack of Evidence of Preparation
Lack of Interest in Work
Inadequate Knowledge of Subject Matter

[Most cell values are illegible in this copy. Legible entries, in order of appearance: 46 28; 43 00; 40; 39; 17.]













differences in performance. (See Table III and 
Table IV.) 

More good than poor teachers appeared to be not only highly motivated themselves but also appeared to have more highly motivated pupils. Discipline appeared to be more or less of a problem in the classrooms of a majority of the poor teachers, whereas no disciplinary situations were observed in the classrooms of the good teachers. Good teachers asked fewer fact questions and more thought questions than poor teachers.


While Barr found insufficient provision for individual differences in pupils by both good and poor teachers, this weakness was almost universal among the poor teachers. Although peer-appraisal of pupil response was rather infrequent in both groups, a tendency toward cooperative appraisal was evident in about half of the classrooms of the good teachers and almost completely absent in the classrooms of poor teachers.

A number of behaviors were found to be common to good and poor teachers. Very little difference was observed in the teacher-pupil ratio of recitation time consumed and in the length of teacher questions and pupil responses. Barr found both good and poor teachers tending to dominate classroom communication.

Judgments of specific teacher behaviors as being good or bad are made on the assumption that all teacher behaviors stimulate pupil learning of some kind. Teacher behaviors which are believed to promote desirable learning by pupils are considered to be evidence of teacher strengths, and those teacher behaviors which are believed to foster undesirable pupil learning are classified as teacher weaknesses. The quality of the learning experience is believed to be the only valid criterion of the effectiveness of the teacher. In the Barr investigation desirable pupil learning is inferred from observable pupil and teacher behaviors; objective measurements of pupil growth were not made. Comparison of Barr’s findings with evidence from a similar type of study in which a pupil growth criterion is employed will provide some basis for evaluating the validity of Barr’s evidence.

Jayne’s study (37) is one of the few investiga- 
tions reported in the literature in which recorded 
classroom behaviors are studied in relation to a 
pupil gain criterion. In Jayne’s study, since
class recitation responses were tape-recorded,
the data are restricted to oral interaction and
subjects are not separated into good and poor
teachers. There are two studies: in his first study
long-term gains and understanding were measured,
while in the second study the teaching objective
was short-term gains, chiefly recall of factual
material.

Jayne concluded that the ratio of teacher-pupil
talk had little relationship to pupil gain in
information.





Jayne’s data appear to indicate that there is no 
difference in the relative value of thought and fact
questions as far as pupil gain as measured is con- 
cerned. It may be that pupil tests employed by 
Jayne did not measure problem-solving ability. 
Jayne also found that the number of fact questions 
asked by the teacher is only slightly related to 
long-term gains as measured and to delayed recall, 
but significantly related to immediate recall in his 
second investigation. He found no relation between 
the number of thought questions asked and long- 
term gains or delayed recall, but found a high re- 
lationship to immediate recall. 

In regard to the appraisal of pupil responses, 
when long-term gains in understanding are empha- 
sized, Jayne found the number of questions raised 
concerning the correctness of pupil answers to be 
highly related to long-term gain but unrelated to 
short-term gain as measured. No relation was 
found between the number of corrections of mis- 
statements and long-term gains or delayed recall, 
but a significant relation to immediate recall was 
found. 

While Barr’s good teachers were judged to
possess better speaking voices than his poor
teachers, Jayne found no relation between
the teacher’s speech ability and either immediate
or delayed recall of factual material in his second
study.

In Jayne’s long-term gain study an Index of
Meaningful Discussion was derived from that por- 
tion of the data relating to the amount of dis- 
cussion growing out of pupil interests and experi- 
ences, bringing in additional information, clarifi- 
cation of ideas, and the challenging of pupils to 
support their opinions with evidence. These be- 
haviors appear to be reasonably comparable in 
purpose to the following behaviors reported by Barr 
which were found to differentiate good and poor 
teachers: use of illustrative materials, wealth of 
commentary remarks by the teacher, use of pupil
experiences, and the ability to stimulate pupil in- 
terest. Jayne found teachers who scored high on 
the Index of Meaningful Discussion to be the more 
successful in promoting pupil long-term gain in his 
first study and the least successful in promoting 
short-term gain or the rapid learning of factual 
material.

The employment of different criteria of effectiveness
in the studies by Barr and Jayne produces
evidence which in some instances contradicts conclusions
drawn in the other study and in other instances
supports the conclusions. In addition, the
evidence in one of Jayne’s studies tends to contradict
the evidence in the other study. In many instances
the conclusions derived from Jayne’s first
study support at least to some extent some of
Barr’s conclusions. In this case the objectives in
the two situations may be assumed to be related to
pupil growth in understanding, whereas in Jayne’s
second study, in which the objective was the rapid
acquisition of factual information, teacher behavior
believed to promote pupil growth in understanding
tended to be either not related or else
negatively related to the immediate recall of facts.

It appears that whether a specific teacher behav- 
ior is good or poor teaching depends to some ex- 
tent upon objectives and upon the context in which 
the behavior occurs. 

This completes the summarization of investigations
of good and poor teachers chosen for
special review here. None of these investigators
has mapped the entire terrain of teacher qualities
or behaviors. Hypotheses, data-gathering
devices, criteria, and conclusions vary from one
study to another. In some instances the characteristics
measured in these investigations overlap,
while in other instances there is little or no overlap.
Although superimposing the data obtained in
these studies, one upon another, does not yield
perfectly clear contrasting profiles of good and
poor teachers, it is possible that some clarification
of the differences between and likenesses among
good and poor teachers may result from such
comparisons. The overlapping areas in these
studies of good and poor teachers broadly include:
1) knowledge, 2) interest and proficiency in
teacher-pupil relationships, 3) physical and emotional
energy, 4) emotional stability, 5) flexibility,
6) dominance, and 7) professional motivation.
Evidence from other studies, in some instances,
will be examined for possible clarification
of the data already discussed.

Teacher training institutions generally 
assume that knowledge differentiates good and 
poor teachers. Differences of opinion concerning 
knowledge as a prerequisite to teacher effective- 
ness have arisen chiefly from conflicts of opinion 
about the relative importance of the several cat- 
egories of knowledge, as for example, knowledge 
of subject matter, professional knowledge, and 
knowledge of mental hygiene. 

In the Jones’ investigation good teachers were 
found to be superior to poor teachers in intelli- 
gence, knowledge of subject matter, and profes- 
sional knowledge as measured by the tests and 
criteria employed. Thirty-five of Barr’s 47 good 
teachers were rated as possessing adequate knowl- 
edge of subject matter while only seven of the 47 
poor teachers were so rated. If professional 
knowledge may be inferred from such teacher behaviors
as the use of illustrative materials, well-developed
assignments, wealth of commentary
remarks, frequent use of pupil experiences, abil- 
ity to stimulate interest, socialization of class 
work, provision for individual differences, and 
organization of subject matter in ways other than 
straight textbook organization, Barr’s good teach- 
ers would appear to possess more professional 
knowledge than his poor teachers. Carlson found 
good teachers somewhat above the average in their
knowledge of mental hygiene as measured by the
Torgerson Teacher-Pupil Relationships Test. If
absence of discipline problems, provision for individual
differences, and opportunities for the expression
of pupil interests in classroom activities
are accepted as evidence of the implementation of 
the teacher’s knowledge of pupils, Barr’s evi- 
dence would appear to indicate that knowledge of 
these matters discriminates between the good and 
poor teachers in his study. 

What evidence there is in these studies seems
to indicate that professional knowledge, knowledge
of subject matter, and knowledge of mental hygiene
are discriminating characteristics. There is no
evidence to the contrary. There is some evidence
that the generalization does not apply to all individual
cases but only to the means of groups of
good and poor teachers. There is no evidence in
these studies to justify any conclusions concerning
the relative importance of the various categories
of knowledge.

It is commonly assumed that there is a dif- 
ference between good and poor teachers in interest 
and proficiency in teacher-pupil relationships. As 
used in this context, this characteristic includes 
the ability to score high on paper and pencil tests
purporting to measure social adjustment and sociability,
classroom behaviors which appear to be
motivated by a liking for people, and high ratings
on qualities commonly associated with the ability
to establish and maintain friendly relations with 
others. 

Differences in responses between good and
poor teachers on the Guilford-Zimmerman Temperament
Survey in Jones’ study, shown in Table
I, suggest that good teachers tend to be somewhat
more sociable than poor teachers. When pupil
gain and rating criteria are combined, Carlson
found good teachers scoring above the group average
on the Washburne Social Adjustment Inventory.
With ratings as the criterion, the evidence in
Carlson’s study shows social adjustment scores
discriminating between good and poor teachers in
all five rating scales. In Lamke’s study the re- 
sponse patterns on portions of Cattell’s 16 PF 
Personality Inventory are interpreted as suggest- 
ing that good teachers tend to be more gregarious, 
talkative, and frivolous and more interested in the 
opposite sex. Barr judged the good teachers in 
his study to be more sympathetic, pleasant, and 
appreciative, happier as they worked with their 
pupils, to possess a keener sense of humor, and 
to appear more interested in pupil responses than 
the poor teachers. 

In Carlson’s study, in which both ratings and
pupil gain criteria are employed, the Washburne
test did not discriminate between good and poor
teachers with a pupil gain criterion, but in other
studies it was found to discriminate. When pupil
gain criteria were employed in studies by Rostker
(66), Johnson (39), Gotham (30), Rolfe (65), and
Lyon (50), although all correlations are positive,
no statistically significant correlations of criteria
to the various measures of social adjustment
or social intelligence employed were obtained.
These results stand in rather sharp contrast to
correlations obtained in investigations employing
a rating criterion.

The differences in results obtained with different
criteria account for many of the conflicting
findings. Although it may not seem logical, it is
possible that the strict disciplinarian who may appear
not to be concerned about human relationships
may secure the gains on tests such as have been
used to measure pupil growth and achievement.
The halo effect may operate to produce an apparent
relationship but not a true relationship between
the rated effectiveness of the teacher and measures
of social characteristics. Paper and pencil
tests devised to measure the social and personal
prerequisites to teacher effectiveness may not do
so. Concerning the pupil gain criterion, most investigators
agree that the available devices measure
only certain aspects of pupil growth and development.
While most attempt to do something
with attitudes, the measuring devices are generally
quite inadequate. No measures attempted
to measure personality growth and development.

The conclusions relative to the teacher’s in- 
terest and proficiency in teacher-pupil relation- 
ship as a determining factor in teacher effective- 
ness appear to depend upon the criterion. When 
pupil change is the criterion, the relationship is 
not significant; when rating is the criterion, there
appears to be a relationship. 

Physical and emotional energy is a third factor
identified in the studies of good and poor teachers
as a possible differentiating characteristic.
Lamke and Jones found good teachers’ responses
on temperament tests indicating quickness of action
and efficiency in production. Lamke found
good teachers to be above average in emotional
response. Barr rated good teachers as being
more vigorous and enthusiastic than poor teachers.


Evidence from other studies in which temperament
measures were employed is not completely
in agreement. Montross (57) found good teachers
possessing more speed in performing psychomotor
tasks. Schwartz (72) found reaction time
negatively related to criterion. Erickson (24) ob- 
tained low correlations of criterion to scores on 
the active and vigorous response portions of the 
Thurstone Temperament Schedule. The weight of 
evidence in the studies examined appears to sup- 
port the conclusion that physical and emotional 
energy is to some degree a differentiating char- 
acteristic of good and poor teachers. 

Measures of qualities which appear to be related
to emotional stability were employed in all
the studies of good and poor teachers included in
this discussion. Lamke’s data suggest that good
teachers are more placid and cheerful and subject
to more nervous tension than poor teachers.
Emotional stability as measured by the Bernreuter
test appears to discriminate somewhat more in
Carlson’s study when pupil gain is the criterion
than when rating is the criterion, although it is
difficult to be certain as to what this means. Of
the twenty teacher measures employed in the
Jones’ study, emotional stability as measured by a
sub-test of the Guilford-Zimmerman Temperament
Survey is least related to her composite
rating criterion. Barr noted a greater degree of
self-control in good teachers than in poor teachers.
The evidence in these studies is certainly
inconclusive and probably somewhat contradictory.

Gotham (30), Rostker (66), Rolfe (65), and
Hellfritzsch (34), on the other hand, are in agreement
that emotional stability as measured by the
Bernreuter test is not significantly related to their
pupil gain criterion. To add further confusion, a
summary of investigations by Barr et al. (6) reports
that 33 investigators found a positive relationship
between emotional stability and criterion.
Thirteen zero correlations and no negative correlations
are reported in this summary. While
there is no lack of evidence to support this point
of view, there is a lack of evidence that is conclusive
enough to generalize with assurance on the
discriminating power of emotional stability as
measured by the devices employed in these investigations.

The evidence in the literature on dominance 
as a discriminating factor in teacher effectiveness
is possibly more inconclusive than the evidence on 
emotional stability. Carlson concluded that dom- 
inance as measured by the Bernreuter test dif- 
ferentiates between good and poor teachers when 
rating and pupil gain are combined. Dominance 
discriminated in one of five rating scales and at 
the 20% level to pupil gain. Jones found good 
teachers somewhat more dominant as measured by 
the Bernreuter test than poor teachers. Lamke 
found good teachers more dominant as measured 
by the Cattell inventory than poor teachers. 

In studies by Gotham, Rostker, Rolfe, and
Hellfritzsch dominance appeared to be positively
but not significantly related to pupil gain. In a
summary of investigations by Barr et al., twenty-four
investigations are reported in which dominance
was found to be positively related to criterion,
nine in which the correlation was zero, and
none reporting negative correlations.

The weight of evidence appears to support the 
conclusion that dominance as measured is possibly
a minor discriminating quality. 

Flexibility as contrasted with rigidity is commonly
believed to be a desirable quality in teachers.
Jones found good teachers possessing a more
flexible disposition as measured by the Guilford-Zimmerman
paper and pencil tests than poor
teachers. Lamke’s data suggest that good teachers
appear to be more adventurous and unconventional
than poor teachers. These qualities would
appear to be suggestive of flexible disposition.
Differences in the recorded behaviors of good and
poor teachers led Barr to conclude that good
teachers are more willing to experiment than poor
teachers. The evidence in these studies points in
the same direction and seems to suggest that
there may be some difference in flexibility between
good and poor teachers, although it does not appear
to be a critical factor.

In the past two decades considerable effort 
has been focused upon the study and measurement 
of motivation. Since this topic is discussed else- 
where in this paper, the discussion here will be 
limited to the evidence in the investigations 
of good and poor teachers. Carlson found attitude 
toward teaching as measured by the Yeager teach- 
er attitude test to be a discriminating quality when 
pupil gain alone was employed as a criterion, at 
the 20% level of confidence, and when rating and 
pupil gain were combined but not when rating alone
was the criterion. Carefully recorded behaviors
of teachers over a period of time may well
provide as valid a measurement of professional 
motivation as available paper and pencil tests 
purporting to measure this characteristic. If in- 
terest in teaching may be inferred from such 
teacher behaviors as interest in pupil response,
use of illustrative materials, ability to stimulate 
pupil interest, amount of daily preparation, and 
degree of enthusiasm displayed, Barr’s evidence 
appears to suggest that good teachers display 
more highly motivated behavior than poor teach- 
ers. 

In the foregoing discussion evidence from 
certain studies of good and poor teachers has been 
summarized, compared, and in some instances 
supplemented by evidence from other studies. 
Seven qualities were found to overlap in varying 
degrees in these studies: 1) knowledge, 2) inter- 
est and proficiency in teacher-pupil relationships 
and affiliated problems, 3) physical and emotional
energy, 4) emotional stability, 5) flexibility,
6) dominance, and 7) professional motivation. 

The evidence in these studies seems to support
the conclusion that there are measurable
differences between good and poor teachers in
certain qualities and, of course, little difference
in certain other qualities. Some characteristics
appear to be critical. Others appear to be contributing
factors and essential in only minimal
amounts. In some instances a higher degree of
the quality does not appear to increase effectiveness.
The evidence is not conclusive. More evidence
than is available in these studies is needed
to establish a firm relation between some of the
qualities measured and teacher effectiveness.
The problem of identifying patterns of characteristics
which differentiate good and poor teachers
is compounded by many things, but particularly
those arising from the use of different and inadequate
criteria and different measuring devices that
may or may not be reliable. How situational differences
affect the importance of these qualities
is a problem for future research.

The evidence presented seems to support the 
thought that academic and professional knowledge 
are important qualities differentiating good and 
poor teachers. 

The difference between good and poor teachers 
in interest and proficiency in teacher-pupil rela- 
tionships is found to vary with the criterion. 
When rating is the criterion the responses of good 
teachers as measured in these studies appear to 
indicate that they possess greater interest and
proficiency in teacher-pupil relationships than 
poor teachers. The evidence from other studies 
employing pupil gain as the criterion seems to 
support the conclusion that interest and pro- 
ficiency in teacher-pupil relationships is not a 
differentiating characteristic. Since the two cri- 
teria apparently do not measure the same educa- 
tional objectives the evidence suggests that con- 
cern over teacher-pupil relationships is not 
related to pupil growth in academic achievement 
but may be related to the pupil’s personal growth 
which is not measured in these investigations. The 
degree to which this may be true is difficult to assess
on the basis of available evidence.

Although the evidence is not conclusive, the 
weight of the evidence indicates that there is some 
reason to believe that physical and emotional en- 
ergy differentiates to some extent between good 
and poor teachers. It may be that a certain min- 
imal degree of this quality is essential to teacher 
effectiveness but excessive amounts may become 
detrimental. 

While emotional stability would seem logically 
to be a prerequisite to teacher effectiveness, the 
evidence here examined based upon the tests used 
indicates that different investigators get different 
results varying from no relation to a moderate relation.
It is recognized that available paper and
pencil tests which purport to measure emotional 
stability are inadequate. There is also a selective
factor. Few extremely unstable teachers enter or 
remain long in the teaching profession. The dif- 
ferences in the people that enter teaching are not 
great enough in this respect to become in many in- 
stances a critical factor. It seems likely that a 
certain amount of emotional stability is essential 
to successful teaching. 

The evidence concerning dominance as a discriminating
factor in teacher effectiveness presents
very much the same picture as has been
noted above in regard to emotional stability and
physical and emotional energy. It is likely that
few cases at either extreme are represented in the
teacher populations studied. It seems likely, too,
that it would be the extreme cases which might be
associated with ineffective teaching and the medium
range most likely to be related to effective
teaching. The differences within the medium
range are difficult to measure accurately with available
measuring devices. The weight of evidence,
however, seems to indicate that good
teachers are as dominant as or slightly more dominant
than poor teachers, but dominance as measured
in these studies appears not to discriminate
highly between good and poor teachers.

The evidence on the discriminating power of 
flexibility and of professional motivation indicates 
a positive relation to effective teaching but again 
the evidence is insufficient to provide a basis for 
making a firm generalization about this. Some
minimal degree of flexibility would seem to be essential.
Motivation has often proven to be an intervening
variable between ability and achievement.
Establishment of the relation between these
qualities and teacher effectiveness, as with certain
other characteristics considered in this discussion,
appears to be a problem of measurement.

Of the characteristics believed to differentiate
good and poor teachers, only knowledge of
subject matter and pupils and professional knowledge
appear to be definitely established as discriminating
between good and poor teachers. Interest
and proficiency in teacher-pupil relationships
appears to be related to the personal growth
of pupils but unrelated to academic achievement.
Good teachers seem to possess some minimal degree
of physical and emotional energy, emotional
stability, dominance, and flexibility. The evidence
in these studies of good and poor teachers on professional
motivation is too limited to warrant
drawing conclusions.





JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961) 


CHAPTER IX


THE PERSONAL PREREQUISITES TO TEACHER EFFECTIVENESS 


A. S. BARR 


Beginning with the first teacher rating scale 
there has been continued interest in the personal 
prerequisites to teacher effectiveness. While 
there has been, during the last half century, a 
growing and persistent interest in the psychology 
of learning, in individual differences, and in child
development in relation to the teacher’s classroom 
responsibilities, the description of teacher behav- 
ior in terms of personal characteristics has con- 
tinued. Accordingly, one finds in literature today 
these two well established systems of describing 
teacher effectiveness, namely, one that describes 
teacher effectiveness in terms of the personal 
prerequisites, and one that describes teacher ef- 
fectiveness in terms of professional compe- 
tencies. 

It is the purpose of this chapter to consider
some of the approaches made to the study of the
personal prerequisites to teacher effectiveness in
the investigations here summarized, and with
what success. As in the other papers in this monograph,
not all of the researches relating to the
personal prerequisites to teacher effectiveness
will be considered but, rather, a certain selected
few of the studies that seem to illustrate departure
from the commonplace in some important respect.

As one examines the studies here summarized 
with reference to the personal prerequisites to 
teacher effectiveness one finds first of all that dif- 
ferent sorts of tests, rating scales, and other data 
gathering devices have been employed. Some- 
times the teachers were interviewed, sometimes 
observed at work, and sometimes asked to answer 
questions about their own behavior, attitudes, and 
values. All in all a very great variety of devices 
were employed. 

Barr, Torgerson, and others, for example,
used three tests of personal characteristics and
only one personality rating scale, except that the
teacher rating scales all contained items relative
to personal fitness. The three tests were Social
Intelligence Test, F. A. Moss, T. Hunt, K. T.
Omwake, Center for Psychological Service,
Washington, D. C., 1930; The Morris Trait Index
L, E. H. Morris, Public School Publishing Company,
Bloomington, Illinois, 1930; and Personal
Health Standard and Scale, T. D. Wood, Bureau
of Publications, Teachers College, Columbia University,
New York, 1925. The personality rating
scale was an informal ten-point scale based upon
thirty-three personality traits taken from the
Charters-Waples Commonwealth Teacher Training
Study, and other sources. The correlations with
the various criteria are shown in Table I.

None of the measures, except possibly social 
intelligence with an over-all composite criterion, 
and the personality rating, was statistically sig- 
nificant. The latter correlation was probably due 
to halo effect and the former probably due to some 
sort of generalized verbal aptitude. From data 
presented in the report of this study the three cri- 
teria employed in the study appeared to measure 
different things. 

Gotham (30) attempted a systematic study of
personality and teacher effectiveness, employing
two types of personality measures, namely, tests
and rating scales. Three criteria were employed:
a) supervisory ratings, b) tests of qualities thought
to be associated with teacher effectiveness, and c)
residual pupil gain. Three standardized tests,
three widely used teacher rating scales, and two
unstandardized personality ratings were employed
to measure the personality characteristics of the
teachers with results as shown in Table II.

The high correlations (r = .87) between the two
unstandardized personality scales and the criterion
of teacher rating scales are probably due to halo
effect. The intercorrelations among the criteria
were r_bc = .13; r_ac = .40; r_ab = .25. The only test that
seemed to offer promise was the Washburne Social
Adjustment Scale. The low correlations may be
traceable in part to errors of measurement but
probably mostly to the fact that the different
criteria measure different aspects of teacher effectiveness.
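
The claim that the criteria measure different things can be illustrated by squaring each intercorrelation, which gives the proportion of variance two criteria share. The following is a minimal sketch in Python, not part of the original study: the function name is hypothetical, and the assignment of each reported value to a particular pair of criteria (a, b, c) is an assumption, since the subscripts are not fully legible in the source.

```python
# Hypothetical helper: proportion of variance shared by two measures,
# i.e. the squared Pearson correlation (coefficient of determination).
def shared_variance(r: float) -> float:
    return r * r


# Intercorrelations among the three criteria as reported in the text.
# ASSUMPTION: the pairing of each value with a pair of criteria is a guess.
intercorrelations = {"a-b": 0.25, "a-c": 0.40, "b-c": 0.13}

for pair, r in intercorrelations.items():
    share = shared_variance(r)
    print(f"criteria {pair}: r = {r:.2f}, shared variance = {share:.4f}")
```

Even the largest of the three values, .40, corresponds to only about 16 percent of variance in common, which is consistent with the interpretation that the criteria tap largely different aspects of effectiveness.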

Von Haden’s (82) study differs from the other
studies here summarized in that the chief data-gathering
devices were interviews, autobiographies, and
statements by members of the department of education,
all recorded while the subjects were seniors
enrolled in the teacher education program at the University
of Wisconsin. The previously discussed
studies employed tests and rating scales. The data
were carefully analyzed with reference to eight
personal characteristics, namely, adaptability, considerateness,
energy, initiative, professional judgment,
system of values, and work habits. The
scores on these characteristics and the mean of the
eight measures were correlated with three criteria









TABLE I 











Teacher Measures       Pupil Gain   Rating Scales   Composite of All Measures*

Social Intelligence    .13   .10
Morris Trait Index L   -.11   .08
Wood Health Scale      .14   -.11
Personality Rating     .30   .35





*Except measure correlated 


TABLE II 





Criteria 


(b) 





A. Three Tests of Personality
   1. Bernreuter Personality Inventory
   2. Washburne Social Adjustment Inventory
   3. Rudisill Personality Scale

B. Three Teacher Rating Scales
   1. Michigan Teacher Rating Scale
   2. Torgerson Diagnostic Teacher Rating Scale
   3. Almy-Sorenson Rating Scale

C. Unstandardized Personality Scales
   1. A Four-point Personality Scale          .87   .15
   2. A Thirty-three Item Personality Scale   .87   .14


. 04 
.14 


. 06 
. 35 
. 30 
. 39 
- 43 
. 36 
. 30 


. 35 





*The correlations between the rating scales and the supervisory ratings were gener- 
ally high but spurious due to the halo effect. 





of teacher effectiveness, namely, a) a composite
of five supervisory ratings, b) a composite pupil
evaluation, and c) residual pupil gain. If only
those correlations that were statistically significant
at either the 1% or 5% level are considered, 45
were with the supervisory rating criterion, 12
with the pupil evaluation criterion, and only one
with the pupil gain criterion. Two hundred and
sixteen of the correlations were not statistically
significant. The statistically significant correlations
for the several characteristics were as
follows: initiative 11, professional judgment 9,
work habits 8, energy 4, system of values 4,
adaptability 3, and considerateness 2.
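
How a single correlation might be sorted into the 1%, 5%, or not-significant class can be sketched as follows. This is only an illustration in Python: the source does not say which significance test was used or give the sample size at this point, so the Fisher z approximation, the function names, and the sample size of 50 are all assumptions introduced here.

```python
import math


def fisher_z_statistic(r: float, n: int) -> float:
    """Approximate standard-normal statistic for testing rho = 0.

    Uses the Fisher z transformation: atanh(r) * sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("sample size must exceed 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return math.atanh(r) * math.sqrt(n - 3)


def significance_level(r: float, n: int) -> str:
    """Classify r using two-sided normal critical values."""
    z = abs(fisher_z_statistic(r, n))
    if z >= 2.576:  # two-sided 1% critical value
        return "1%"
    if z >= 1.960:  # two-sided 5% critical value
        return "5%"
    return "not significant"


# ASSUMPTION: n = 50 is a hypothetical sample size, not taken from the study.
for r in (0.10, 0.30, 0.60):
    print(r, significance_level(r, n=50))
```

With small samples an exact t test would ordinarily be preferred; the normal approximation is used here only to keep the sketch free of external libraries.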

One of the unique features of this study was
the use of personal characteristics as intervening
variables, i.e., the analysis goes from recorded
behaviors to inferred personal characteristics, to
the criterion. There would appear to be adequate
data in these investigations to lead one to infer
that, as valuable as a behavioral approach is for
data gathering and defining teacher characteristics,
teacher effectiveness cannot be defined in
terms of behaviors out of context, i.e., divorced
from purposes, persons, and situations. When
the appropriateness aspect of behaviors is disregarded
the correlations would appear to approach
zero as a limit.

Erickson (24) employed, among other meas- 
ures, seven measures of personal characteristics 
derived from the Thurstone Temperament Sched- 
ule and sixteen measures derived from the Cat- 
tell Sixteen Personality Factor Test. These were 
correlated with supervisory ratings, self-evalu- 
ation, peer evaluation, and pupil evaluation with 
correlations as shown in Table III. 

It is seldom that one finds more consistently 
low coefficients of correlation than here report- 
ed, all, with possibly one or two exceptions, sta- 
tistically insignificant. If one accepts these tests 
as being adequately validated against other popu- 
lations, and they have been extensively so used, 
then possibly we can give these teachers clear- 
ance, meaning that selective factors operated in 
a manner as to eliminate teachers that may fall 
in the critical areas for these tests. 

Singer (73) was interested in social competency
as a factor in teacher effectiveness. The
intercorrelations among the social measures are 
indicated in Table IV, while the intercorrelations 
among the measures of teacher effectiveness are 
shown in Table V. 

The correlations of the measures of social
competency with measures of teacher effectiveness
are shown in Table VI.

Singer, instead of investigating a number of 
characteristics, examined one in depth. As one 
examines the data, particularly the correlations 
with the supervisory rating criterion, some new 
insights are had as to the interrelationship among 
the several measures and this criterion. All of 







the correlations are positive and probably statis- 
tically significant except the teacher’s estimate of 
how well he is accepted by the students. While 
correlations are not high, they would appear to 
offer some promise for further investigation. 

Schwartz’s (72) study is interesting because 
it represents a new approach to the prediction of 
teacher effectiveness. A comparison of this study 
with Gotham’s study reveals how markedly dif- 
ferent these two studies are and gives some 
evidence of advance in personality measurement 
in the decade that separated these two studies. 
Using Cattell’s grouping of personality traits as a 
point of departure, Schwartz constructed new tests 
or adopted Cattell tests for use with teachers. His 
tests cover jokes, tempo, absence of questionable 
preferences in reading, perceptual speed of closure, 
inability to state logical assumptions, suggestibility, 
cube perspective, ideomotor speed, ratio of consonant 
to dissonant statements recalled, ratio of purposeful 
to chance observation and memory, two-hand 
coordination, immaturity of opinion, impairment of 
memory by emotion, reaction time, body sway, 
suggestibility, and perseveration. With supervisory 
rating as the criterion, fairly substantial correlations 
were found for two-hand coordination (-.60), reaction 
time (-.53), temp jokes (-.42), and perceptual speed of 
closure (-.38). A multiple correlation of .74 was found 
between three measures, namely, two-hand coordination, 
reaction time, and the Washburne Social Adjustment 
Inventory, and the supervisory rating criterion, which 
would appear to be an improvement over Gotham’s 
findings from a prediction point of view. 
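
The way a multiple correlation such as the one reported above combines separate predictors can be illustrated with a small sketch. It is not Schwartz's computation: the two criterion correlations (-.60 and -.53) are taken from the text, but the correlation between the two predictors is not reported, so the values tried below are purely hypothetical, and the function name is the editor's own. The standard two-predictor formula is used.

```python
import math


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2 -- correlation of each predictor with the criterion
    r_12       -- correlation between the two predictors
    Uses R^2 = (r_y1^2 + r_y2^2 - 2*r_y1*r_y2*r_12) / (1 - r_12^2).
    """
    if not -1.0 < r_12 < 1.0:
        raise ValueError("r_12 must lie strictly between -1 and 1")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)
    return math.sqrt(r_squared)


# Criterion correlations reported in the text (supervisory rating criterion).
R_TWO_HAND_COORDINATION = -0.60
R_REACTION_TIME = -0.53

if __name__ == "__main__":
    # The predictor intercorrelation is NOT given in the source;
    # these values are hypothetical and serve only to show the trend.
    for assumed_r12 in (0.0, 0.30, 0.60):
        big_r = multiple_r_two_predictors(
            R_TWO_HAND_COORDINATION, R_REACTION_TIME, assumed_r12
        )
        print(f"assumed r12 = {assumed_r12:.2f} -> R = {big_r:.3f}")
```

Under these assumptions the multiple correlation falls as the two predictors overlap more, which is why a third, relatively independent measure can add to prediction.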

Montross’ (57) study was undertaken as a fol- 
low-up on Schwartz’s (72) study. Two criteria were 
used: composite one, a composite of four in-service 
ratings made at the end of the first year of teaching, 
and composite two, these ratings plus an additional 
rating by the principal made at the end of the second 
year of teaching. Correlations were calculated between 
these criteria and a number of measures of temperament. 
The calculations for six of the objective measures and 
the two criteria are given in Table VII. 


Description of Tests: 





1. Speed of tapping (number of taps in 30 seconds 
on tapping instrument) was one of five meas- 
ures of tempo. 


2. Deviation in reaction time (forty reactions in 
a visual series with irregularly spaced time 
intervals) was one of four measures of speed. 


3. Fluency (number of adjectives to describe a 
house) was one of three measures of fluency. 


4. Right and left hand coordination (difference 
between the average number of taps of right 







TABLE III 


Measure                         Measures of Teaching Success 

A. Thurstone Measures 
 1. Active 
 2. Vigorous 
 3. Impulsive 
 4. Dominant 
 5. Stable 
 6. Sociable 
 7. Reflective 

B. Cattell Measures 
 1. Cyclothymia 
 2. Intelligence 
 3. Emotional Stability 
 4. Dominant 
 5. Surgency 
 6. Positive Character 
 7. Adventurous Cyclothymia 
 8. Emotional Sensitivity 
 9. Paranoid Schizothymia 
10. Bohemianism 
11. Sophistication 
12. Worrying Suspiciousness 
13. Radicalism 
14. Self-Sufficiency 
15. Will Control 
16. Nervous Tension 








TABLE IV 


Rating 

1. Teacher-Teacher Social Distance 
2. Teacher-Student Social Distance 
3. Student-Teacher Social Distance 
4. Social Competency Based upon Recordings 
5. MMPI (Social Competence) 
6. MMPI (Social Introversion) 





TABLE V 


Rating 

Composite Supervisory Ratings 
Teacher Self-Evaluation 
Pupil Reaction to Instruction 
Specialists’ Evaluation of Audio Recordings 

+.565   +.524   +.262   +.171 





TABLE VI 


Rating                                              1        2 

Teacher-Teacher Social Distance (staff rating)     .431    -.088 
Teacher-Student Social Distance                   +.150     .068 
Student-Teacher Social Distance                   +.702     .069 
Audio-Recordings Social Competency                +.524     .171 
MMPI (Social Competency)                           .567    -.255 
MMPI (Social Introversion)                         .471    +.776 










TABLE VII 


                                            Composites 
Measure                                  No. 1        No. 2 

Speed of Tapping                      (illegible)      .47 
Deviation of Reaction Time in a 
  Positive Direction                      .51          .39 
Fluency (adjectives)                      .41          .28 
Right and Left Hand Coordination      (illegible)      .47 
Maze Number 1                             .42          .36 

*The signs have all been reflected since negative correlations indi- 
cated positive relationships. 


TABLE VIII 


Measure 

1. Speed of Tapping 
2. Reaction Time 
3. Fluency 
4. Right-Left Hand Coordination 
5. Maze Number 1 











and left hand) was one of two measures of dex- 
terity. 


5. Maze Number 1 (trace through a maze laid out 
in blocks comparable to a city street system) 
was one of three tests of disposition rigidity. 


The intercorrelations among these measures 
are shown in Table VIII. 

These findings are in substantial agreement 
with Schwartz’s findings, which makes them some- 
what more interesting than the results from iso- 
lated studies. 

Margaret Jones (41) correlated scores on a 
number of measures of temperament and a com- 
posite rating criterion. The scores on measures 
of disposition rigidity, flexibility, general activ- 
ity, restraint, ascendence, and sociability cor- 
related significantly with a rating criterion and 
separated good teachers from poor teachers. A 
number of the correlations were in the thirties 
and forties. The findings are significant because 
they seem to support those of Schwartz and Mont- 
ross and indicate that temperament measurement 
may be a fruitful field for predictors of teacher 
effectiveness. 

Flanagan (26) studied the Minnesota Multi- 
phasic Personality Inventory in relation to teach- 
er effectiveness. He found that a high coding of 
scale 3 (Hy), hysteria, was significantly related 
to teacher effectiveness; scale 5 (Mf), masculini- 
ty, was found to be related to teacher effective- 
ness for both men and women but less firmly than 
scale 3. There also appeared to be an inverse 
relationship for scale 2 (D), depression, for women. 
Drake interprets low coding of scale 2 for women 
to relate to ‘‘socially outgoing characteristics.’’ 
Not only did Flanagan study particular character- 
istics but also patterns of response. No meaning- 
ful patterns were discovered except men presented 
different personality patterns than women. Flan- 
agan concludes that while the natural processes of 
college life and the effects of testing, counseling, 
and guidance tend to eliminate the extreme per- 
sonality deviate, the Minnesota Multiphasic Per- 
sonality Inventory has good possibilities as a pre- 
dictor of teacher effectiveness. The Minnesota 
Multiphasic Personality Inventory has been exten- 
sively studied with positive results. Society has 
placed high value upon the aspects of personality 
measured by this scale. It deserves wide use as 
a screening device and a clearance test. 


Summary and Observations 





1. Many different words are used to describe the 
personal characteristics of teachers. One of 
the problems confronting workers in this area 
is how to reduce the list of descriptive terms 
according to some meaningful pattern. 







2. The problem of measurement has not been 
solved. While a variety of data gathering de- 
vices were employed, such as tests, rating 
scales, self reporting inventories, inter- 
views, and direct observation of behavior, 
none, except possibly the measurement of 
temperament and social competency, showed 
much validity. Most of the correlations were 
near zero, depending upon the criteria. 


3. The different investigators and constructors 
of data gathering devices defined the charac- 
teristics differently, and in most instances 
chose to measure different aspects of person- 
ality even where similar vocabulary was em- 
ployed. A difference of particular concern 
arises out of the fact that some investigators 
appeared to think of these personal character- 
istics as constituents of the person, i.e., as 
something within the person, and others thought 
of the personal characteristic as external and 
inferred from a study of behavior, i.e., they 
employed the vocabulary to describe behavior. 
This would appear to be an important differ- 
ence. Some give to these personality char- 
acteristics the status of intervening variables 
with action regulating powers; others used 
these terms to merely describe behavior. The 
latter would appear to the writer to have much 
greater promise than the former. 


4. There is a serious problem of definition. The 
terms employed in discussing the personal 
characteristics of teachers mean many dif- 
ferent things to different people. No field needs, 
more than personality measurement, a mean- 
ingful system of definition, such as might be 
achieved through carefully constructed behav- 
ioral and/or operational definitions. 


5. While the terms used to characterize the per- 
sonal prerequisites to teacher effectiveness 
need to be solidly anchored in observable be- 
havior, behaviors, even when taken in context, 
which frequently they are not, are too numer- 
ous to provide a useful system for describing 
teacher effectiveness. Without getting too 
many intervening variables, and variables that 
are too ethereal to be verifiable, there is need 
for simplified schemata for reducing the number 
of things that educators need to keep in mind in 
the evaluation of teacher effectiveness. 


6. It has been frequently observed that different 
criteria measure different aspects of teacher 
effectiveness. Not too much can be achieved 
in the validation of personality measures until 
better criteria are developed. 







7. Possibly less use might well be made of self 
reporting devices and the conventional point 
value rating scale, and more use might be made 
of tests, observable behaviors, and measur- 
able personal characteristics. 


8. The most promising positive relationships 
were found for objective measures of emotion- 
al stability, social competency, certain scores 
on the Minnesota Multiphasic Personality In- 
ventory, and the tests of temperament. 





JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961) 


CHAPTER X 


MOTIVATION OF TEACHERS AND TEACHING SUCCESS 


THOMAS A. RINGNESS 


That one’s motivation influences his behavior 
not only through initiation, but through determin- 
ing its direction, strength, and perseverance, is 
such common knowledge among psychologists that 
it is a matter of some surprise to see how com- 
paratively few studies of teacher effectiveness or 
prediction of teaching efficiency have taken this 
factor into account, except possibly in 
an incidental way. Motivation has been a special 
concern of personality theory builders, of social 
psychologists, of clinicians, of personnel special- 
ists and educational psychologists. This is not to 
say that no concern has been shown for the moti- 
vation of teachers in the studies here summarized, 
as we shall see, but the concern is frequently 
peripheral. Measures of teacher satisfactions 
and annoyances, of reasons for choosing teaching 
as a career, of social and professional attitudes, 
and of related matters appear to get lost in the wel- 
ter of things that may be related to teacher effec- 
tiveness. Accordingly, it appears to the writer 
that the problem of clarifying this area of study 
remains basically unsolved. 

Possibly the difficulties of working through the 
complexities of motivation theory have stood in 
the way of effective research in this area. It is 
true that the formulations of motivation theory are 
both numerous and somewhat controversial, and 
it is also true that we have found so many levels 
and aspects of behavior more or less directly re- 
lated to human motivation that the complexity of 
the task of relating the many variables to practice 
may have discouraged many workers in this area. 
The difficulties are many and real. 

Be that as it may, why one teaches relates to 
the ways one teaches, to the satisfactions he finds 
in his work, and thus directly or indirectly to his 
success. We shall discuss first a frame of ref- 
erence and then relate some of the studies in this 
area to this frame of reference. 


A Formulation of Human Motivation 





Human motivation functions in initiating and 
continuing behavior. It affects perception, and is, 
in turn, affected by it. It adds direction and out- 
put. Thus whatever its course, motivation will 
be a factor in decisions about which behaviors will 
be employed, which goals will be sought, how cer- 
tain events are perceived, how long behavior will 
continue, and the degree of satisfaction and conse- 
quent reinforcement of behavioral tendencies 
which result from responses in a situation. 

Shaffer and Shoben (34)* have briefly consider- 
ed some of the explanations of motivation. We 
shall mention a few. 


Postulations of Motivation 








At one time, motivating power was attributed 
to ‘‘instincts’’ (34: 25), so that a pugnacious per- 
son was thought to have an instinct to fight, the 
hard worker an instinct to industriousness, and 
so on. Such instincts were thought to be born in 
one, and a person thus merely carried out his na- 
tural tendencies. This sort of belief places people 
at the mercy of their genes, and really does not 
help solve motivational problems. In effect, all 
that is thus said is that someone fights because he 
has a tendency to fight— a descriptive rather than 
an explanatory statement. 

Another hypothesis has been that motivation is 
the result of some kind of inner energy or force, 
psychological in character, which initiates and 
continues behavior. Psychoanalytic theory was 
based on such hypotheses, as with Freud’s gener- 
al life instinct or Jung’s libido, ‘‘a broad drive 
supplying energy for all behavior’’ (34: 26). With 
Shaffer and Shoben, we would argue that the pos- 
tulation of a psychic energy as different from all 
bodily energy is introducing an unnecessary vari- 
able, for it can reasonably be held that so-called 
‘‘intellectual’’ or ‘‘emotional’’ behavior is in ac- 
tuality based upon neural electro-chemical activ- 
ity which is not basically different from that of 
the other behaviors of the body, such as muscular 
movement, sensation, or digestion. Thus such a 
postulate may be useful to some in describing cer- 
tain aspects of human behavior, but there is a lack 
of evidence to show that psychic energy exists or 
is different from bio-chemical energy. 

Motivation may be described as the result of a 


* The reference numbers used in this chapter refer to the numbered bibliography at the end of this par- 
ticular chapter. 







need or deficiency in the organism, so that an or- 
ganism lacking in food might be motivated to eat, 
and so on. However, it is pointed out (34: 27) that 
in animal studies, hunger may increase activity 
for perhaps four days, but that after this, activ- 
ity decreases. An element of expectation must 
therefore enter the picture, so that motivation is 
not related to need alone, although needs must 
necessarily be a factor. 

We can also examine motivation as reaction to 
stimulus, in that deficiencies in water content of 
the organism, food lack, or the prick of a pin ac- 
tivates the organism through internal or external 
stimulation. It is evident that many internal and 
external stimuli are acting upon one at any given 
moment, so that stimulus patterns may be ex- 
tremely complicated. Some stimuli are persis- 
tent, as in the case of hunger, and require that 
behavior be maintained over a period of time. 
Other stimuli, as perhaps a momentary sound, 
arouse only fleeting activity. Like the concept of 
need, stimulation is important to motivation the- 
ory, yet only partly an explanation (especially in 
regard to human motivation). 

Snygg and Combs (36) consider that there is in 
reality only one basic motive, that of striving to 
better cope with or protect oneself against the en- 
vironment, and that ‘‘hunger drive,’’ ‘‘pain a- 
voidance,’’ etc., are really only individual man- 
ifestations of the basic over-all striving. They 
make much of the role of self perception, and 
suggest that as one perceives himself and the 
situations in which he finds himself, his motives 
will be related to attaining (or maintaining) abili- 
ty to deal with the situation satisfactorily. 

Maslow (28), recognizing biological drives, 
postulates possible instinctive higher order mo- 
tives as well, and also those derived through 
learning and which are largely social. Yet Mas- 
low seems to make most of the tendency of the 
human organism to ‘‘self-actualization,’’ imply- 
ing that one tends to make use of whatever phys- 
ical and intellectual powers he possesses, devel- 
oping them and gaining satisfaction from their 
use. 


Motivation as Viewed for the Present Discussion 





Essentially, human motivation seems to be a 
matter of arousal of the organism, activity being 
in the direction of a goal, ceasing when the goal 
has been achieved. This is in keeping with drive 
reduction theory, in which drives may be defined 
as patterns of persistent stimulation that induce 
sustained activity (34: 28). A drive originates in 
body conditions, such as result from needs for 
food or oxygen. The drive may be a function of 
external stimuli, as when we attempt to avoid 
pain because of the prick of a pin. 





Having been aroused, the behavior of the or- 
ganism is in the direction of reducing the drive, 
or, to put it another way, to remove the stimuli. 

Thus we learn to seek definite goals, and to 
respond accordingly. Precisely how we respond 
is a function of the various internal and external 
stimuli acting upon us, and our learned, and us- 
ually habitual, ways of coping. It is in this rather 
basic way that we learn to perceive, finding cer- 
tain elements in the environment that act as cues 
to drive-reducing behaviors. 

As a result of our interaction with the environ- 
ment, we find pleasant or unpleasant consequen- 
ces, hence drive reduction or drive intensifica- 
tion, or perhaps the initiation of new drives. We 
develop feeling tones or emotions, which are both 
biological and intellectual in character. That is, 
we both recognize that we are angry or afraid or 
happy, and physical changes accompany this re- 
cognition. Emotions, thus learned from conse- 
quences of behavior, develop along with other ex- 
pectancies of the situation, as for example, when 
we become anxious at the thought of having a tooth 
pulled. Emotions can then become drives, and 
cause us to modify our behavior. 

Although originally motivation can come only 
from drives related to one’s physiological status, 
one learns or acquires drives of a social nature. 
Through consequences of behavior (pleasant-un- 
pleasant, reward-punishment, pleasure-pain, 
negative-positive reinforcement), we learn to en- 
gage in certain activities and not in others. For 
example, we may learn to be friendly to other 
people, since we have found this sort of behavior 
usually results in pleasant consequences. Or we 
learn to be aggressive, defending ourselves and 
gaining drive reduction of fear, pain avoidance, 
threat to the self. 

Among the socially derived motives we find 
mentioned are motives to seek approval of others, 
to master one’s tasks, to gain attention, to enjoy 
new experience, and many others. Each is learn- 
ed, over a period of time, insofar as it relates 
to earlier drives and drive reduction, which are 
thought to be originally biological in nature. 

Such learning occurs in early childhood, gen- 
eral patterns of approach-withdrawal, submis- 
sion or dominance, particular goals, emotions, 
and other motivational aspects developing early 
as a consequence. (We are not yet entirely clear 
as to the extent that temperament, related to her- 
edity, plays a part through the bodily processes.) 
Learning continues throughout life to modify mo- 
tivation, as well as reactive behavior. 

Motives can be described at various levels of 
generalization. Some permeate most aspects of 
our lives, whereas others are more specific to 
given situations. Some appear to represent basic 
personality characteristics, developed early in 







life, while others represent more easily modified 
characteristics of the individual. Generalization 
of motives, goals, and responses takes place as 
we learn, and we thus develop broader goals, re- 
act to more varied stimuli, and respond in more 
general ways. 

We learn to internalize, and make our own, 
some of the motives and wishes of others, as 
when a little boy learns to keep clean, or when we 
learn to seek relief from feelings of guilt raised 
because we have accepted mores and values of our 
society. Some motivational patterns change dur- 
ing one’s life, others remaining relatively con- 
stant. It is also probable that no two of us have 
precisely the same systems of motives, likeness- 
es within a culture representing the effects of 
learning. And even within a culture, there are 
individuals whose motives differ widely from the 
‘‘norms.’’ 

In any situation, one’s motives are related to 
both the drives (stimuli) he possesses at that mo- 
ment, and to the individual habits or response 
tendencies he employs to reduce those drives (re- 
move the stimuli). There are both personality 
predispositions (primarily learned) and momen- 
tary environmental influences, which combine in 
complex ways to influence behavior (34). 

The expectancy value of the immediate situa- 
tion essentially determines how one shall react to 
reduce given drives. If there is ambiguity in the 
situation (two or more expectancies), our motives 
may conflict and behavior become anxious, emo- 
tional, and inefficient. 

Motives may be differentiated at a high level of 
conscious awareness, or with various degrees of 
known goal-orientation. We may employ various 
escape and defense mechanisms, disguising some 
of our motives, even from ourselves. 

Attitudes may be defined as motives organized 
in response to a person, situation, or institution 
(33: 92). They are a personal evaluative reaction 
and are basically either acceptant or rejectant, 
‘‘for’’ or ‘‘against’’ something. They may be of 
varying order of intensity. 

Interests may be defined as intensity of atti- 
tude, so that one may be of favorable attitude and 
highly interested, of neutral attitude and not par- 
ticularly interested, or of some other combina- 
tion of attitude and interest, degree and direction 
of reaction. We would presumably move reac- 
tively toward something to which our attitude was 
favorable, but react against or withdraw if the at- 
titude were unfavorable. How strongly we react 
is the result of intensity of interest. Attitudes 
and interests are both learned from consequences 
of responses to situations, drives either being re- 
duced or not. 

Values may be described as generalized atti- 
tudes. Purposes may be defined as symbolic rep- 
resentations of the goals one has in mind, a goal 
being the end activity that terminates motivated 
behavior (as eating may be the goal of behavior 
motivated by hunger stimuli). Our goals are or- 
iginally learned, so that we need to experience 
the consequences of behavior which is motivated 
before we can understand what our goals are. 
For example, the tiny baby does not know that 
eating will relieve his hunger drive, and indeed 
does not know that he is hungry. But when he has 
eaten and his drive is reduced, he learns to have 
eating as one of his goals. 

Incentives are external stimuli introduced to 
motivate one. They introduce the element of ex- 
pectancy, and act as potential reward, or as 
goals for the behavior we wish to initiate. They 
need not be related directly to the behavior being 
fostered, but may serve to reduce other drives 
which they may also initiate. Thus we may offer 
money for the digging of a ditch. The digging re- 
sponse does not satisfy the hunger drive, but 
leads to the attainment of the incentive, money, 
which in turn may be used to purchase food which 
will, in turn, reduce the hunger drive. Incentives, 
thus conceived, are intermediate variables in 
drive reduction. 

Diagrammatically, we can represent motiva- 
tion as follows: 


DRIVES --> ACTIVITY is aroused --> RESPONSES are made --> DRIVES ARE REDUCED 

DRIVES                 ACTIVITY is aroused   RESPONSES are made   DRIVES ARE REDUCED 
Biological or social   Expectation           Situational          Removal of stimuli 
Persistent stimuli     Goal oriented         Habitual             Consequences 
Internal or external   Emotions              Learned              Learning 
Emotions               Attitudes             Trial and error      Further motivation 
                       Interests             Emotions             Emotions 
                       Values 
                       Purposes 
                       The Self 


Discussion 


It will be noted that several principles have 
been put forth. The first is that activity is not 
aroused without the presence of stimuli, and that 
truly motivated behavior (not reflexive) requires 
persistent stimuli. Activity results in the seek- 
ing of goals, and the stimuli to which one re- 
sponds, environmental in character, depend upon 
the drive in operation. Responses are either 
learned or habitual, or tentative, the consequen- 
ces of the response resulting in removal of the 
persistent activating stimuli (drive reduction, 
stimulus removal) or not. Although this sounds 
as if all stimuli were unpleasant and the organism 
were in a constant state of attempting to escape 
such unpleasantness, this is not a necessary as- 
sumption. We can conceive of motivation in this 
context more as analogous to the electric refrig- 
erator, which has automatic switches connected 
with thermostats. When the refrigerator gets too 
warm, this is made known to the switches by the 
thermostats, and the machine turns itself on. 
When the proper temperature is reached, the ma- 
chine switches itself off. It is not necessary to 
postulate feelings of pleasure or pain in regard to 
the stimuli, although it is true that these fre- 
quently accompany motivational states. 
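
The refrigerator analogy can be made concrete with a minimal sketch. It is offered only as an illustration of the on-off regulation described above, not as a model proposed in the studies summarized here. The function name, the temperature limits, and the warming and cooling rates are all hypothetical. The temperature plays the part of the persistent stimulus, the running machine the part of aroused activity, and the return to the lower limit the part of drive reduction.

```python
def simulate_thermostat(start_temp, upper, lower, steps,
                        warm_rate=0.5, cool_rate=1.0):
    """Simulate a refrigerator that switches itself on and off.

    All parameter values are illustrative assumptions.
    Returns a list of (temperature, machine_running) pairs, one per step.
    """
    temp = start_temp
    running = False
    history = []
    for _ in range(steps):
        if temp > upper:
            # Stimulus present: too warm, so the machine turns itself on.
            running = True
        elif temp <= lower:
            # Proper temperature reached: the machine switches itself off.
            running = False
        # While running the machine cools; otherwise the cabinet warms.
        temp += -cool_rate if running else warm_rate
        history.append((round(temp, 2), running))
    return history


if __name__ == "__main__":
    for step, (temp, running) in enumerate(simulate_thermostat(4.0, 5.0, 3.0, 14), 1):
        state = "on" if running else "off"
        print(f"step {step:2d}: temperature {temp:4.1f}, machine {state}")
```

Nothing in the loop refers to pleasure or pain; activity starts and stops solely as the stimulus crosses its limits, which is the point of the analogy.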

Motives are influenced by the consequences of 
our actions. This principle leads us to consider 
not only expectancies concerning the outcomes of 
our responses, but also the situations in which 
we find ourselves, and further, to generalizations 
about ourselves as usually able to cope with our 
environment and needs, or not. Maslow (28) has 
suggested that the dimensions of self-esteem and 
security are extremely important in this regard. 
If one is both high in self-esteem and security, 
confident in his relationship to his environment, 
usually in a social sense, he is likely to be a 
leader or a strong follower, and his motives and 
their expression through behavior will be differ- 
ent from those of one who lacks self-esteem, se- 
curity, or both. 

Performance, as in teaching, is now seen to 
be a function of several variables: 

a. One’s potential. His knowledge, intelli- 
gence, developmental level, etc. 

b. One’s cognition of the situation. His per- 
ception, based upon knowledge, expectan- 
cy, the situation itself. 

c. One’s motivation. His drives, goals, emo- 
tional status. 

d. The environment itself. As a limiting fac- 
tor in performance. 

e. Personality variables. Temperament, 
habitual behavior (traits, types, etc.). 

We can now argue a progression in which we 
consider that 

a. the motives, themselves, lead to goal- 
seeking activity; and 

b. one’s expectancies about himself and the 
situation lead to choice of behavior; and 

c. behavior results, as a function of percep- 
tion of the situation; resulting in 

d. consequences of behavior; which further 
modify perceptions, and lead to drive 
reduction, or not. 

Emotional tone may accompany all elements 
of this progression, acting as drive, in regard to 
expectancy, and consequence. Consequence leads 
to learning, and to modification of perception. 


Considerations in Studying Motivations of 
Teachers 





When studying the motivations of teachers as 
related to teaching efficiency, then, several ele- 
ments must be considered. These may be group- 
ed under two main headings: 

a. the needs or drives; expectancies or per- 
ceptions; and consequences of teachers, 
as they affect the ways the teacher acts. 

b. the needs or drives; expectancies or per- 
ceptions; and consequences for pupils, fel- 
low staff, parents, administrators, and 
community, of the way the teacher acts. 

The first group will determine how well satis- 
fied the teacher is, and this is related to his per- 
formance. The second group determines in part 
how the performance of the teacher, however mo- 
tivated, is evaluated. 

The following specific questions may then be 
asked: 

Regarding the motivation of the teacher— 

1. Why is the teacher interested in teaching? 
How long has he held these interests? How strong 
are they? How changeable? 

2. What are the expected benefits of teaching, 
as the teacher sees them? What are his expected 
difficulties or dislikes in teaching? 

3. How satisfied has the teacher been with 
teaching, student-teaching, or similar experien- 
ces? 

4. How legitimate are the teacher’s motives? 
How realistic? 

5. How does he perceive himself? 

6. What are the teacher’s particular goals? 
How are they related to his motives? 

7. How do situational factors and habitual re- 
sponses affect his drive reduction? 

8. Are the teacher’s goals attainable? To 
what extent? 

9. Which drives are reduced and which inten- 
sified through teaching? 

10. Which are related to teaching and which to 
other areas of his life? 

11. What age, sex, or other differences exist? 
How do his motives change? 

12. Do the consequences of his teaching beha- 
vior relate well to his expectations? 

13. What is the hierarchy of his motives? 

14. Are his motives appropriate, in a social 
sense? 
Regarding the ways the teacher responds— 

1. How satisfying to others is the behavior of 







the teacher? 

2. Are expectations about the teacher realis- 
tic and reasonable? 

3. Are pupil needs satisfied or hindered 
through his teaching? 

4. Is there conflict between the roles the 
teacher plays and the expectations of others? 

5. Do the teacher’s ways of satisfying his 
needs seem harmful to pupils or others? 

6. Is his effect on all pupils or in all situa- 
tions the same? 

Thus, when we ask about the ‘‘motivation’’ of 
the teacher, we are really asking no question at 
all, for motivation must be defined in terms of the 
variables we have mentioned. 

It is not surprising to find that the teacher’s 
effects are variable; that he can fail in one situa- 
tion and do well in another. We are beginning to 
find (12) that the teacher affects various pupils 
differently, and that one’s teaching efficiency is a 
combination of pupil-teacher-situational factors, 
about the patterns of which we know very little. 

In our consideration of the findings of various 
studies related to the motivation of teachers and 
teaching success, we shall be aware of the com- 
plexity of the problem. But we may expect cer- 
tain areas of enlightenment, while recognizing 
that much has yet to be studied. We shall group 
our discussion into three headings: 

1. Studies of why teachers chose teaching as 
a profession. 

2. Studies of teacher attributes. 

3. Studies of teacher and others’ satisfactions 
and annoyances. 

We shall recognize that there are two aspects 
to be considered: 

a. Certain motives may be related to certain 
kinds of teachers or to teacher behavior, which 
may or may not be rated as excellent by others. 

b. Satisfaction of teacher motives may or may 
not be present. This, in turn, may affect their 
behaviors and subsequent successes or lacks. 

Both kinds of studies are present in our dis- 
cussion. 


Why Teachers Choose to Teach 





In 1948, Best (3) found that persons who chose
teaching as a profession believed teaching to be
more secure, the profession to be less overcrowded,
that there was less physical strain,
more opportunity for proper home life, more adequate
lifetime income, that it was easier to gain
the needed education, and that there was less
opposition from parents and relatives than with
certain other professions.

Studies by Corey (5), Cramer (6), Eliassen (7),
Gould (9), Hanly (10), Harris (11), Hollis (14),
Larson and Marzolf (20), Lawton (21), Lee (22),
Orton (24), Seagoe (33), Thurman (38), Tudhope
(39), Valentine (40), and Yeager (41) tend to
indicate a relatively small number of reasons listed
by students for the choice of teaching as a
profession. Fox and Richey (8) found variables of
‘‘security,’’ chance for advancement, good working
conditions, pleasing environment, chance to serve
society, work with young people, prestige,
personal satisfaction, social contacts, and other
factors as advantages seen by prospective teachers.
But there were disadvantages, including low pay,
restricted private life, poor working conditions,
monotony, parental disapproval, responsibility,
‘‘nerve strain,’’ too many bosses, low social
standing, lack of prestige, lack of academic
freedom, little tenure, politics, and other factors
seen by those not intending to teach.

Accordingly, Ringness (31) in 1951 employed 
a composite of reasons from these studies in his 
attempt to see how motivation of teachers was re- 
lated to teaching efficiency. In this study, it was 
postulated that the reasons given for choice of 
teaching as a profession were at the level of Cat- 
tell’s surface traits. It was believed that through 
factor analysis, ‘‘surface trait’’ information, col- 
lected through tests and questionnaires, could be 
reduced to the level of ‘‘source traits.’’ Accord- 
ing to Cattell (4), surface traits may be a com- 
bination of environmental and constitutional as- 
pects of personality, and may be approached by 
observation of behavior and by testing. Source 
traits are deep, underlying traits, and may be 
approached by factor analysis of surface traits. 
Interests are dynamic personality traits. Basic 
drives and acquired interests may be described 
as source traits and surface traits. 

Another way of stating this might be to suggest 
that specific attitudes, goals, or values may be 
reduced to more generalized attitudes, goals, or 
values. It was reasoned that such generalized 
motivational factors might be found manifested
consistently throughout teacher behavior, and
hence be reflected in teaching success or lack thereof.

In this study, only secondary school teaching 
candidates were studied, a notable lack, but ne- 
cessary because the training institution did not 
prepare elementary teachers at that time. Sixty- 
three men and 37 women seniors were studied, 
and of these, 16 men and 18 women were visited 
the following year in their first teaching positions. 
Thus, 47 men and 19 women were not visited. 
Although approximately ten had accepted teaching 
positions too far away to be visited, the balance 
were not teaching. A commentary on the mortality
rate of the profession is evident. This is also
a reflection on strength of (reported) motivation,
and the possibility exists that student reports of
attitudes were not highly valid, or else that their
motivations changed radically within the year.
During the period of the study, there was a
shortage of teachers.
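
The follow-up counts in this paragraph can be checked with a little arithmetic. The sketch below is illustrative only: the figure of ten teachers placed too far away is the approximate one given in the text, and the variable names are our own.

```python
# Seniors studied and teachers visited, as reported in the text.
studied = {"men": 63, "women": 37}
visited = {"men": 16, "women": 18}

# Subjects who were not visited in a first teaching position.
not_visited = {sex: studied[sex] - visited[sex] for sex in studied}

total_studied = sum(studied.values())
total_not_visited = sum(not_visited.values())

# "Approximately ten" were teaching too far away to be visited
# (an approximate figure, so the remainder is approximate as well).
too_far_away = 10
not_teaching = total_not_visited - too_far_away

print(not_visited)
print(f"{total_not_visited} of {total_studied} not visited; "
      f"roughly {not_teaching} not teaching")
```

On these figures, somewhat more than half of the original group appear not to have been teaching in the year after graduation.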

A paired-comparison test and a ranking
questionnaire were devised, employing 13 reasons for
choice of teaching as a career: the beliefs of
relatively good financial reward; ease of obtaining
necessary education; security against job loss and
layoffs; clean and attractive physical surroundings;
short working hours and frequent vacations;
opportunity for professional advancement;
opportunity to serve society; prestige and respect of the
profession; ease of obtaining a position;
opportunity to pursue a favorite (subject matter) interest;
variety of activities and little monotony;
environment of interesting co-workers; and relative lack
of physical strain. It is interesting to see that
some of these ‘‘reasons’’ reflect popular
stereotypes, but not reality. For example, ‘‘frequent
vacations’’ often are, to the practice teacher, a
chance to catch up on work in which he has
dropped behind. Summer vacations are seen as more
comparable to factory lay-offs, since one is not
paid for such time off. The ‘‘short hours’’ do not
include the grading of papers, preparing lessons
at home, etc.

Both the paired-comparison test and ranking 
questionnaire employed the same items, since the 
circularity of the first instrument might be pre- 
sumed to offset the probable inequalities of dis- 
tance in the second. Both have the disadvantages 
of any self report scale, so that we should not be 
surprised to find differences in patterns of choice 
on the two instruments as used by any one student. 
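
To make the mechanics of a paired-comparison instrument concrete, the following minimal sketch shows how 13 items generate 78 comparisons, how a rank order can be read from the number of times each item is preferred, and how inconsistent (circular) judgments can be counted. It is not the scoring procedure of the study, which the text does not describe; the abbreviated labels, the function names, and both respondents are hypothetical, and we assume that ‘‘circularity’’ may be read as intransitive triads of the form A over B, B over C, C over A.

```python
from itertools import combinations
from math import comb

# Abbreviated labels for the 13 reasons (the abbreviations are ours).
REASONS = [
    "financial reward", "ease of education", "job security",
    "attractive surroundings", "short hours and vacations",
    "professional advancement", "service to society", "prestige",
    "ease of obtaining position", "favorite subject interest",
    "variety of activities", "interesting co-workers",
    "lack of physical strain",
]


def all_pairs(items):
    """Every unordered pair of items: n(n-1)/2 comparisons."""
    return list(combinations(items, 2))


def win_counts(items, choices):
    """Count how often each item was preferred.

    `choices` maps each (a, b) pair to the item the respondent preferred.
    """
    wins = {item: 0 for item in items}
    for preferred in choices.values():
        wins[preferred] += 1
    return wins


def rank_order(wins):
    """Items sorted from most to least often preferred."""
    return sorted(wins, key=lambda item: wins[item], reverse=True)


def circular_triads(wins):
    """Number of intransitive triads when every pair has been judged.

    Each transitive triad has exactly one item preferred to both others,
    so there are sum(C(w, 2)) transitive triads; the rest are circular.
    """
    n = len(wins)
    return comb(n, 3) - sum(comb(w, 2) for w in wins.values())


# Hypothetical, perfectly consistent respondent: always prefers the
# item that comes earlier in REASONS.
consistent = {pair: pair[0] for pair in all_pairs(REASONS)}
consistent_wins = win_counts(REASONS, consistent)

# Hypothetical three-item cycle: A over B, B over C, C over A.
cycle = {("A", "B"): "A", ("B", "C"): "B", ("A", "C"): "C"}
cycle_wins = win_counts(["A", "B", "C"], cycle)

print(len(all_pairs(REASONS)))  # 78 comparisons for 13 items
print(rank_order(consistent_wins)[0])
print(circular_triads(consistent_wins), circular_triads(cycle_wins))
```

A ranking questionnaire forces a single transitive order, whereas paired comparisons allow a respondent to be inconsistent; this is one reason the two instruments need not agree for any one student.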

Such indeed was the case. The resulting fac- 
tor analyses provided no useful data, so the re- 
sults will not be considered here. However, in 
terms of raw data, the paired-comparison test 
and ranking questionnaire provide considerable 
interest and show consistency. In order of rank, 
the reasons for choice obtained from the paired- 
comparison test are shown below. 


MEN

1. Opportunity to pursue favorite (subject matter)
interest.
2. Opportunity to serve society.
3. Variety of activities, little monotony.
4. Security against layoffs, job loss.
5. Environment of interesting co-workers.
6. Prestige and respect of profession.
7. Clean, attractive physical surroundings.
8. Opportunity for professional advancement.
9. Short hours, frequent vacations.
10. Financial reward.
11. Ease of getting necessary education.
12. Ease of obtaining position.
13. Lack of physical strain.

WOMEN

1. Opportunity to serve society.
2. Opportunity to pursue favorite (subject matter)
interest.
3. Environment of interesting co-workers.
4. Variety of activities, little monotony.
5. Prestige and respect of profession.
6. Security against job loss, layoffs.
7. Clean, attractive physical surroundings.
8. Financial reward.
9. Short hours, frequent vacations.
10. Opportunity for professional advancement.
11. Ease of getting necessary education.
12. Ease of obtaining position.
13. Lack of physical strain.

A variety of comments may be made at this
point. We may first raise the question of whether
we are getting at the real reasons for choice of
teaching as a profession, or whether we are
getting an echo of the popular stereotype of teaching.
We note sex differences, as expected, in the
value systems of men and women. We also note the
disdain with which the possibly insulting last three
items were treated.

The ranking questionnaire produced essentially
the same results. Differences in factor analytic
results appeared linked to the patterns of choice
of the particular subjects, rather than the
over-all order of items as given by the group as a
whole.

In comparing the teaching profession with 13
other occupations, those of medicine, law,
business management, sales, journalism, engineering,
the ministry, social work, advertising,
laboratory research, accounting, farming, and
entertainment (the stage), teaching was ranked by these
subjects on each of the following items.

Financial reward
Ease of getting education
Security against job loss
Attractive surroundings
Working hours, vacations
Professional advancement
Service to society
Prestige, respect
Ease of getting position
Opportunity for favorite interests
Variety of activities
Interesting co-workers
Lack of physical strain

[The rank values for men and women that accompanied these items are not legible in the source.]

The teaching profession does not come off
poorly except in the area of financial reward. It
is also evident that at least at this training
institution, the teaching preparation is not regarded
as a program of ‘‘snap’’ courses. Thus, teaching
is thought to provide attractive surroundings, the
work is not arduous, there is opportunity to
pursue a favorite subject matter interest, there are
interesting co-workers, and plenty of variety. In
a word, teaching is a pleasurable occupation.

From a check-list and from autobiographies,
it seemed that influences on the choice of career
were largely those of individual study of one’s
own interests and capabilities, and the urgings of
teachers and/or parents. Major interest in
teaching seemed to have been long-lived for most
subjects, and centered in the subject matter area,
rather than in children as such. It should be
noted that there were no elementary teachers in the
group.

During the second year of the study, teachers 
were assessed in the field, using the criterion of 
‘‘acceptability’’ as a measure of success. This 
is defined as ‘‘the quality of the teacher which 
leads others to feel that she should be retained in 
position or possibly promoted,’’ and is defended 
as taking into account the varying expectancies, 
duties, roles, values, and other variables of the 
teacher and her teaching environment. Teachers 
were rated by two University observers and by the 
teachers’ supervisors, a composite acceptability 
rating being the result. 

Multiple correlations between items of the
choices instruments and acceptability ratings were
.76 for men and .78 for women. These are not
high multiple R’s, but show some relationship
between reasons for choice of teaching and teaching
success. That is, the belief that teaching offers
opportunity to serve society and to pursue a
favorite interest, and the finding that teaching is
interesting in a variety of ways, seem more likely
to differentiate successful teachers than beliefs
in teaching as an easy, secure sort of occupation.
This does not surprise us, for the first
motivations listed are demonstrations of positive attitude
and interest, and should be related to vigorous,
willing workers, whereas the latter are more
nearly related to low level of aspiration,
lackadaisical attitude, and even choice of occupation by
default.
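
One way to see why such coefficients call for caution is to convert each multiple R to a proportion of variance and then apply the usual small-sample adjustment. The sketch below does so for the reported values, using the 16 men and 18 women who were visited. The number of predictors entering each equation is not stated in the text, so the value of three used here is purely a hypothetical assumption, as are the function names.

```python
def r_squared(multiple_r):
    """Proportion of criterion variance associated with the predictors."""
    return multiple_r ** 2


def adjusted_r_squared(multiple_r, n_subjects, n_predictors):
    """Small-sample adjustment: 1 - (1 - R^2)(n - 1) / (n - k - 1)."""
    df_residual = n_subjects - n_predictors - 1
    if df_residual <= 0:
        raise ValueError("need more subjects than predictors plus one")
    r2 = r_squared(multiple_r)
    return 1 - (1 - r2) * (n_subjects - 1) / df_residual


# Reported multiple R and number of teachers visited, from the text.
reported = {"men": (0.76, 16), "women": (0.78, 18)}

# Hypothetical: the number of predictors is NOT given in the source.
K_HYPOTHETICAL = 3

for sex, (multiple_r, n) in reported.items():
    raw = r_squared(multiple_r)
    adjusted = adjusted_r_squared(multiple_r, n, K_HYPOTHETICAL)
    print(f"{sex}: R^2 = {raw:.3f}, adjusted R^2 = {adjusted:.3f}")
```

With samples this small, the adjusted value falls quickly as predictors are added, so that the apparent relationship shrinks further the more items are allowed into the equation.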

In view of changes between 1949 and the
present, we raise several questions: Is the perception
of teaching as a profession the same today,
considering changes in salary schedules, licensing
requirements, new buildings, teacher training
changes, changes in teaching objectives, etc.?
Is the kind of student in teacher training the same
as he was in 1949-50? Are teachers from state
or county colleges, and at elementary school
levels, the same as the subjects in this (university)
study? What effect have assumed over-crowding
of the classroom, emphasis on pre-college training
for high-school pupils, increased homogeneity
in pupil grouping, teaching machines, television,
and the lay press attitudes had on teaching
as a profession, and teacher motivation in
particular? (In dealing with elementary teaching
candidates, the writer has been struck with the
number who prefer to work with primary level
children rather than those of the upper grades,
because ‘‘they are still nice and not so hard to live
with.’’) Society as a whole has changed, along
with the teaching profession. We now need a new
look at the inter-relationships involved.

A later study (16) examined responses of high
school seniors to a set of structured situations
concerning teaching as a career. ‘‘Projective’’
anecdotes were employed, and subjects were
studied using the Bell Adjustment Inventory and
the Strong Vocational Interest Blank. The father’s
occupation was also considered. Of these
students, the following percentages described
teaching unfavorably:

Low pay 29% 
Discipline 22% 
Long hours 18% 
Dull, boring 14% 
Limited opportunity 14% 
Lack of ability 12% 
Too much education required 8% 

These students were obviously considering not 
only their own motives and abilities, but the ex- 
periences they have had in school. On the other 
hand, favorable responses were as follows: 

Like children 52% 
Help society 48% 
Interest in people 19% 
Security 16% 
Satisfaction in watching children grow and learn 16%
Good people to work with 54% 
Good hours 32% 
Job satisfaction 27% 

However, in the response to ambiguous
‘‘projective’’ device situations, there was a tendency
to describe teachers as ineffective and
unattractive. Possibly this represents the stereotyped
attitude of pupils to teachers, but since choice of
career is often made at this time, a need for
changing such stereotypes is indicated.

The Bell, Strong, and (used in one instance)
Kuder tests were found ineffective in relation to
the projective measure employed.


Studies of Teacher Attributes 





Hellfritzsch (13) attempted factor analysis of a
number of measures frequently used in
investigating the nature of teaching ability. Employing
teachers of social studies in non-departmentalized
eighth grade rooms during one year and in
one-room rural schools during a second year, he first
looked at the characteristics of pupils. These
were given pre- and post-tests of social studies,
and reading and intelligence tests were employed
to examine possible differences in these forms of
readiness. Pupil gains on the social studies tests
were developed into an index of teaching efficiency.
Teachers were given tests of intelligence, know- 
ledge of subject matter, personality, professional 
aptitude, social attitudes, and other special tests. 
They were also rated by their supervisors on var- 
ious scales. 

Certain factors emerged, not unexpectedly, 
considering the instruments employed. These 
were a factor of general knowledge and mental 
ability; of personal and social adjustment; of ‘‘eu- 
logizing attitude toward the teaching profession’’; 
and of teaching ratings by supervisors. In
considering the special factor of supervisory ratings, it
was shown that such ratings were not related to 
pupil gain, knowledge of the subject, mental abil- 
ity, or social and emotional adjustment, but were 
apparently, generally speaking, divorced from 
any of the ‘‘objective’’ measures employed. It 
was hypothesized that supervisory ratings must 
have been related to the individual ideas of the 
rater as he observed relatively minor teacher- 
pupil behavior. We may consider whether the 
raters in this study had definite behaviors they 
were looking for, and if so, what the basis for 
choice of such specifics might have been. Was 
the training of the raters adequate? Reliable? 

The ‘‘eulogizing attitude’’ toward the teaching 
profession appears related to motivation and job 
satisfaction. The general tenor was that ‘‘teach- 
ers are good--pupils are good’’—a rather eu- 
phoric attitude. This could also be related to the 
mental health of the teacher. 

In the first year of the study, Hellfritzsch con- 
cluded that pupil gain was most highly correlated 
with the general knowledge and mental ability of 
the teacher. 

In the second year of the study, a fifth factor,
that of ‘‘teaching ability,’’ was isolated. However,
pupil gain in the one-room rural schools seemed 
related only to the teacher’s liberal social beliefs 
and to sound working knowledge of symptoms, 
causes, and remedies of pupil adjustment difficul- 
ties. Differences were found in the common fac- 
tor composition for the four-room state graded 
school teachers and the one-room rural teachers, 
not surprising in view of the differences in the 
roles of these teachers. ‘‘Teacher factors’’ were 
found more important than ‘‘pupil factors’’ in the 
state graded schools, contrary to results for the 
one-room rural schools, probably because in the 
rural schools pupils were left more to study ‘‘on 
their own,’’ each pupil actually receiving less 
teacher guidance by comparison. 

Hellfritzsch further emphasizes that although
he was able to relate mental ability and subject
matter knowledge in the one study, and social
attitudes and understanding of pupils in the other, to
pupil gain, no single factor could be considered
important by itself in influencing such gains.

In effect, we have in this study an iteration of
the complexity of the task of relating ‘‘teacher
success’’ to some hopefully predictive variable
related to the teacher or his training. It is
apparent that teaching efficiency is a function of the
teacher’s particular role, and the extent of his
influence is not easily ascertained. Even when
efficiency is ‘‘measured’’ by such an apparently
objective criterion as pupil gain, we cannot relate it
to any given teacher variable. It is evident that what we have
been doing in many studies is adding predictive 
measure after predictive measure, hoping that the 
composite may be more closely related to teacher 
success than any one instrument, and that we may 
reduce the manifold dimensions of the composite 
by turning to factor analysis. Yet such analysis 
can only reduce the elements employed in the or- 
iginal measurements, and if these do not contain 
the required hypothetical relationship to teaching 
efficiency, this sophisticated statistical technique 
will not help us. A few large multiple correlation 
coefficients would perhaps be more useful, but we 
do not have these either. 

Yet Hellfritzsch’s study does show a factor of
‘‘eulogizing the profession,’’ in part the result of
a teacher attitude inventory, and may suggest that
such attitudes ought to be considered in future
studies of prediction of teaching efficiency.

Insofar as personality may be related to
motivation through such variables as perception, level
of aspiration, general surgency or desurgency,
Lamke’s (19) study is probably indicative of what
we may expect. Using a paired-comparison test
based on Cattell’s 20 surface traits and factoring
the results, Lamke related these to the same
‘‘acceptability’’ criterion of teaching efficiency
employed by Ringness, and, in fact, used the same
subjects. Lamke, Ringness, and Bach were the
university interviewers and raters during the
follow-up of these subjects in their first positions.

Lamke found some relationship between good
and poor teachers and tests of surgency vs.
desurgency, adventurous cyclothymia vs. inherent
withdrawn schizothymia, and sophistication vs.
rough simplicity. Thus, good teachers were
found to be more talkative, cheerful, placid,
frank, and quick, while poor teachers were more
silent, depressed, anxious, languid, and
uncommunicative. Good teachers were found to be more
gregarious, adventurous, frivolous; to have an
abundance of emotional responses, strong artistic
or sentimental interests, and to be interested in
the opposite sex, but poor teachers were found to
be more shy, cautious, conscientious, lacking in
emotional response, lacking in artistic or
sentimental interests, and to have only slight interest
in the opposite sex. Good teachers were found
average in tendencies to be polished, fastidious,
and ‘‘cool,’’ but poor teachers were below
average. The poor were clumsier, more easily
pleased, and more attentive to people. But,
concludes Lamke, a balance among personality traits
is more likely to be related to teaching success
than extremity on any variable. Too much of
anything may be as debilitating as too little.

This leads us to say that possibly the poor 
teachers, as found by Lamke, had had a lesser 
degree of need satisfaction than the good teachers, 
and hence were motivated more toward security, 
caution, and rather repressed, conservative be- 
havior. Apparently such teachers would be more 
interested in the security of the profession, rather 
than the stimulation of pupil and staff contact. 

Singer (35) studied the social competence of
teachers, employing tape recordings, teacher
self-ratings, and ratings by others in estimating
success. He employed certain scales of the
MMPI, together with social distance scales of
teacher-teacher, teacher-student, and student-teacher
variables. He found that there was some
relationship between teaching success and social
competency, and between leadership in the classroom
and leadership among the staff.

But self-perception of how one is accepted by 
colleagues or students, according to Singer, bears 
little relationship to success in teaching. Appar- 
ently the teacher himself, and others, are looking 
for different things when evaluating teaching. 

While this study is only tangentially related to
the motivation of teachers, it does show that the
need satisfaction of the teacher may bear small
relationship to the reality of the situation, since
his self perception and that of others differ
considerably. Further, it is evident from the
leadership and social competence aspects of the
study that those with strong motives (and ability)
to lead and to interact with others tend to be the
more successful.


Studies of Teacher Satisfactions and Annoyances 








It is readily understood that if one enters the
teaching profession for certain reasons, and,
because of pupil, administrative, community,
or other situational inadequacies, he is not able
to satisfy his needs, either his motivations will
have to change, or his happiness, and hence his
efficiency, will. Not only may situational variables
related to the position be like or unlike the
expectancies of the teacher, but it is also true that the
teacher may espouse a different set of values
from the administrator or the community, and
therefore be classified as inefficient because of
these differences in expectancies.

This relationship is complex, and may lead us 
into areas of the training of the teacher, sex dif- 
ferences, age differences, personality, values, 
and other relationships, plus the whole host of 
community and situational variables hindering or 
enhancing the teacher’s work. Worse, whom will
the teacher attempt to satisfy? Himself alone?
Some of his pupils? Certain parents? The ad- 
ministrator? For concepts of ‘‘the good teacher’’ 
may vary greatly, even within one of these groups. 

Thus we may either have teachers whose needs 
are unsatisfied, or who fail to satisfy others, due 
at least in part to the nature of their motives. In 
either case, such a teacher may not be very
effective. A teacher may make good in one
situation and not in another. As Barr indicates (2),
‘‘It is perfectly clear to the investigators that
teaching in the modern school involves much more
than the guidance of learning activities. It
involves many important teacher-pupil
relationships, teacher-teacher relationships,
teacher-administrator relationships, and
teacher-community relationships; and many important
responsibilities growing out of these. These
relationships will limit in a significant respect the
teacher’s success in a given situation and
ultimately pupil growth and achievement.’’

Having considered the work of many persons,
such as Mansen (25), Hoppock (15), Lichliter (23),
Peck (30), Speers (37), and McCluskey (29),
Martindale (27) decided to see whether certain
teachers were ‘‘correctly placed,’’ using a satisfaction
score from a teacher attitude inventory and the
relationship of this to efficiency. In this study, he
compared teacher satisfaction scores with ratings
on the Wisconsin adaptation of the M-Blank, as
an estimate of teacher success. He found that
76% of his subjects were more satisfied than
dissatisfied or neutral, and that only 4% were
definitely dissatisfied. Seventy-five percent of the
sample found living conditions satisfactory, the
community friendly, and the education philosophy
of the administration satisfactory. Fifty percent
found churches, meal accommodations,
community demands, administrative policies, the
curriculum, teachers’ meetings, instructional
materials, the faculty, teaching load, and salary
satisfactory. (It must be remembered that the sample
was voluntary and consisted of first-year
teachers. These may have tended to be more satisfied
than experienced teachers.)

The main area of dissatisfaction was with
cultural opportunities in the community where the
teachers taught. Interestingly, in terms of family
background, those who had been brought up with
less privilege were least satisfied. Did they expect
too much in terms of social mobility from teaching?
Further, since dissatisfaction tended to be related
to the lack of cultural opportunities, such as
museums, zoos, and the theater, did this really reflect
a sort of ‘‘homesickness for the campus’’?

There was found a low (.19) positive
correlation between teacher satisfaction and success.
But, as Martindale points out, ‘‘this study was
limited to the more tangible factors; the
significance for teaching efficiency might lie more in
attitudes, values, the prestige factor, etc....’’
Case studies presented tend to show that satisfied
teachers have lived stable, desirable lives, while
the dissatisfied teachers had various traumas
(parents, moving frequently, early school year
problems, etc.); this is our conclusion, rather than that
of Martindale.

Knox (18) considered the environmental press
on teachers. He found that a potentially upsetting
condition may be present, but that a particular
individual may, because of his temperament or
philosophy of life, not be upset by it. He, too,
found more teachers satisfied than not, most
feeling reasonably happy with administrative
personnel, procedures of the faculty, the particular
teaching station, school organization, the
community in general, instructional materials, the
plant, the people of the community, and the
students. The teachers were least satisfied with the
attributes of the students. A low correlation
between teacher satisfaction and efficiency was
found.

Kline (17) in 1949 studied satisfaction and an- 
noyance in teaching, finding annoyances relative 
to such matters as wanting and being expected to 
be knowledgeable about the individual children, 
providing for individual differences, and apparent 
feelings of restriction about the curriculum and 
about extra-class or extra-teaching duties. More 
seemed frustrated by not being able to accomplish 
what they desired than by inability to fulfill expec- 
tations of others. There were sex differences, 
more men reporting annoyances than women;
differences in marital status, more married
teachers reporting annoyances, perhaps due to school
duties interfering with home plans; and it was
found that the less the education of the teacher
(and presumably the younger?), the more
annoyance. Teachers in small communities reported
more annoyances than those in larger ones.
Those with most experience reported most
annoyance. Does this also relate to professional
insecurity, to less education, to changes in
expectations over the years, or differences in
administration, or in the pupils? Those without military
experience were more annoyed than those with
such experience, suggesting changes in outlook
and basis for comparison. To the ex-military,
teaching may seem more important and attractive.

Many hypotheses suggest themselves but few
inferences may be drawn. There are also the
usual problems of instrumentation and criteria of
efficiency to be considered. Further, we find
satisfaction as much related to personal as to
professional life. For these reasons, we may
hypothesize that the reasons teachers teach seem
reasonably realistic and that their needs seem
reasonably satisfied. It is probable that those
who do not meet reasonable satisfaction
move to other positions within or outside
teaching. Again, as data seem to indicate, there
must be some relationships between motivation
and teaching success, but it is involved in too
complex a manner to be yet well understood.

There remains one other category of study for 
us to consider—that of how well teachers are 
judged to meet role expectancies. That this is 
related to the teacher’s motivation is evident, in 
that if the motives of the teacher and the expec- 
tancies of the community (or administration) are 
congruent, it is more likely that the teacher will 
be found acceptable; but if the teacher and com- 
munity are emphasizing different values or are 
working at cross purposes, the teacher will be 
less likely to be adjudged adequate. 

Manwiller (26) found that in general teachers 
and school board members seemed in agreement 
regarding behaviors expected of teachers. Re- 
ligious life was the major area of agreement, 
civic and social life next in agreement, and per- 
sonal-family life and social and recreational life 
least. Distinctions between what was expected of
teachers and of others in the community were
found. In contrast to an earlier study, the social
roles of men and women were found much alike.

Grotke (8a) employed the concept of
‘‘professional distance’’ based on Bogardus’ postulation
of social distance. Grotke felt that frequency and
degree of divergency between the teacher and
administrator on what constitutes the professional
role of a good teacher might relate to how the
teacher was evaluated. The more discrepant the
points of view, the longer the professional
distance was adjudged to be. Again we see
relationship to motivation in the form of attitudes, values
and expectancies.

In Grotke’s study, certain ambiguous
statements concerning professional position (attitudes)
on various problems were used. He may,
however, be criticized for using admittedly ambiguous
and therefore probably professionally neutral
items, and for working with relatively surface
attitudes and postulates of behavior. It is not
surprising, therefore, to find that the hypothesis in
this instance was not supported.


Conclusion 


We have attempted to analyze the role of
motivation of teachers in teaching success or
efficiency, and have considered several aspects: success
as a practicing teacher; his satisfaction with
teaching and this satisfaction as related to
teaching success; and his motivations and responses as
seen through the expectant eyes of others.

Nowhere have we found relationships which are 
unambiguous and useful for other than minimal re- 
quirements in prediction. Apparently certain at- 
tributes are a necessary but not sufficient condi- 
tion for teaching success, and some of these are 
apparently related to motivation. Other factors 
are in the abilities and other characteristics of 
teachers, in the community, in the
administration, the pupils, and the area of personal-family
life. The criteria of teaching success were also
found wanting. And it may also be assumed that 
‘‘teaching’’ and ‘‘teaching success’’ are variously 
defined or have many different meanings, and 
permit no one overall characterization. For these 
and other reasons, our studies are not as defini- 
tive as we could wish, and much work might well 
be done in this area. Accordingly, supervisors 
and administrators are unable to judge the success 
of teachers with any high degree of accuracy, 
whether in relation to pupil gain, personality, or 
other criteria. Apparently expectations,
professional distance, recency of training, age, size of
community, sex, and other factors cause us to 
lose sight of basic relations between motivation 
and teaching success. 

For the administrator we have several com- 
ments: 

a. It would be well to examine the nature and 
realism of prospective teacher expectancies. 

b. It is useful to check the teacher’s attitudes,
objectives, and emphases.

c. The teacher’s satisfaction with his job is worthy of
study not only in terms of his efficiency, but also in terms
of recruitment and permanency of his career.

d. The “standards” by which we rate teachers need much more
clarification.





For the researcher, it would appear that much needs to be
done in the area we have been discussing, and there are at
least four possible approaches worth considering:

a. We may need some “depth interviews,” or other clinical
techniques, to discover motives, goals, and satisfactions of
specific teachers.

b. We may make more use of analysis of variance, using
teachers with multiple and comparable classes, to attempt to
rule out pupil and cultural variables and get closer to
finding the real effects of the teacher.

c. We may spend more time on the classroom teacher-pupil
interaction, to see what is actually going on.

d. We may pay more attention to pupil gain as a criterion of
success, the gains needing study over time, rather than
“spot checking” as we usually do.
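
The analysis of variance suggested in approach (b) can be
illustrated with a minimal sketch. It assumes that each
teacher has several comparable classes and that a mean pupil
gain is available for each class. The teacher labels, the
gain figures, and the function name one_way_anova are
hypothetical and are not drawn from the studies summarized
here. The F ratio compares variation between teachers with
variation among the classes of the same teacher.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict mapping a
    group label to a list of observations.

    Assumes at least two groups and some within-group variation
    (otherwise the within-group mean square is zero).
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical class-mean gains: three teachers, three comparable
# classes each. These figures are invented for illustration.
class_gains = {
    "teacher_a": [4.0, 5.0, 6.0],
    "teacher_b": [7.0, 8.0, 9.0],
    "teacher_c": [1.0, 2.0, 3.0],
}

f_ratio, df_between, df_within = one_way_anova(class_gains)
print(f_ratio, df_between, df_within)  # 27.0 2 6
```

A large F ratio would suggest that differences between
teachers exceed what class-to-class variation alone would
produce. A fuller design would presumably add pupil and
community variables as further factors.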

In teacher education, recruitment, orientation, placement,
and in-service training, we may well consider teacher
motivations more carefully, together with their meanings and
possible consequences. We must consider further the choice
of goal, choice of response, and subsequent drive reduction.
Anxiety, reward, the self concept, role expectancy,
stereotypes: all need further clarification. This must be
done without forgetting intelligence, other abilities,
training, and other pre-determining factors. Perhaps more
comprehensive cooperative research between schools,
universities, the government, and certain private agencies
is indicated. There is some indication of a need for an
inter-disciplinary approach to these problems.


REFERENCES 


Atkinson, J. W., “Personality Dynamics,” Annual Review of
Psychology, R. F. Farnsworth, Ed. Palo Alto, California,
Annual Review, Inc., 1960.

Barr, A. S., et al., “Validity of Certain Instruments
Employed in the Measurement of Teaching Ability,” The
Measurement of Teaching Efficiency, New York: MacMillan,
1935.

Best, J. W., “A Study of Certain Selected Factors Underlying
the Choice of Teaching as a Profession,” Journal of
Experimental Education, 1948.

Cattell, R. B., Personality, New York: McGraw-Hill, 1950.

Corey, S., “Attitudes Toward Teaching and Professional
Training,” Education Administration and Supervision,
[volume and pages illegible].

Cramer, W., “Vocational Attitude Survey of Seniors in 28
Ohio Schools,” School and Society, 1949, 67, 462-3.

Eliassen, R. H., “Recruitment for Teacher Training,” Review
of Educational Research, 1937, 7, 247-52.

Fox, W. H., and Richey, R. W., “An Analysis of Various
Factors Associated with Selection of Teaching as a
Vocation,” Indiana University School of Education Bulletin,
1948, 24, 1-58.

Grotke, E. M., A Study of the Professional Distance Between
Raters of Teachers and Teachers Rated, unpublished Ph.D.
dissertation, University of Wisconsin, 1952.

Gould, G., “Motives for Entering the Teaching Profession,”
Elementary School Journal, 1934, 35, 95-102.

Hanly, F. W., “Attitudes of High School Seniors Toward
Education,” School Review, 1939, 47.

Harris, R. P., “Students’ Reactions to the Education
Profession,” Education Administration and Supervision, 1946,
32, 513-20.

Heil, L. M., Characteristics of Teachers Related to
Children’s Progress, Summary of report to American Education
Research Association, Chicago, February 25, 1961.

Hellfritzsch, A. G., A Factor Analysis of Teachers’
Abilities, unpublished Ph.D. dissertation, University of
Wisconsin, 1948.

Hollis, E. V., “Why They Teach,” Education Administration
and Supervision, 1929, 15, 678-84.

Hoppock, R., Job Satisfaction, New York: Harpers, 1935.

Johnson, A. H., The Responses of High School Seniors to a
Set of Structured Situations Concerning Teaching as a
Career, unpublished Ph.D. dissertation, University of
Wisconsin, 1956.

Kline, F. I., Satisfactions and Annoyances in Teaching,
unpublished Ph.D. dissertation, University of Wisconsin,
1949.

Knox, W. B., A Study of the Relationships of Certain
Environmental Factors to Teaching Success, unpublished Ph.D.
dissertation, University of Wisconsin, 1953.

Lamke, T. A., Personality and Teaching Success, unpublished
Ph.D. dissertation, University of Wisconsin, 1951.

Larson, A. H., and Marzolf, S. J., “Attitudes of Teachers
College Students Towards Teaching,” Education Administration
and Supervision, 1943, 29, 434-8.

Lawton, J. A., “A Study of Factors Useful in Choosing
Candidates for the Teaching Profession,” British Journal of
Educational Psychology, 1939, 9, 131-44.

Lee, A. S., “Motives of High School Graduates for Entering
the Profession of Teaching,” School Review, 1928, 36.

Lichliter, M., “Social Obligations and Restrictions Placed
on Women Teachers,” School Review, 1946, 54, 14-23.

Orton, D. A., “What Attracts College Students to Teaching?”
Education Administration and Supervision, 1948, 34, 237-40.

Manson, G. E., “Occupational Interests and Personality
Requirements of Women in Business and the Profession,”
Michigan Business Studies, 1931, 3, 281-404.

Manwiller, L. V., “Expectations Regarding Teachers,” Journal
of Experimental Education, XXVI, June, 1958, 315-54.

Martindale, F. E., Situational Factors in Teacher Placement
and Success, unpublished Ph.D. dissertation, University of
Wisconsin, 1951.

Maslow, A. H., Motivation and Personality, New York: Harper,
1954, Chapter 5.

McCluskey, H. Y., and Strayer, F. J., “Reactions of Teachers
to the Teaching Situation: A Study of Job Satisfaction,”
School Review, 1940, 48, 612-23.

Peck, L. A., “A Study of Adjustment Difficulties of a Group
of Women Teachers,” Journal of Educational Psychology, 1936,
27, 401-16.

Ringness, T. A., “Relationships Between Certain Attitudes
Toward Teaching and Teaching Success,” Journal of
Experimental Education, 1952.

Rostker, L. E., Measurement and Prediction of Teaching
Efficiency, unpublished Ph.D. dissertation, University of
Wisconsin, 1939.

Segoe, M., “Some Origins of Interest in Teaching,” Journal
of Educational Research, 25, 673-82.

Shaffer, L. F., and Shoben, E. J., Jr., The Psychology of
Adjustment, Boston, Houghton Mifflin, 1956, Rev.

Singer, A. J., Jr., Social Competence and Success in
Teaching, unpublished Ph.D. dissertation, University of
Wisconsin, 1954.

Snygg, D., and Combs, A., Jr., Individual Behavior, New
York: Harper, 1959, Second Edition.

Spears, H., “What Disturbs the Beginning Teacher,” School
Review, 1945, 53, 458-63.

Thurman, C. H., “Teaching Interests of Students in East
Texas State Teachers College,” Peabody Journal of Education,
1948, 29, 149-50.

Tudhope, W. B., “Motives for the Choice of the Teaching
Profession by Teachers College Students,” British Journal of
Educational Psychology, 1944, 14, 129-41.

40. Valentine, C. V., “An Inquiry as to Reasons for the
Choice of the Teaching Profession by University Students,”
British Journal of Educational Psychology, 1934, 4, 237-59.

41. Yeager, T., An Analysis of Certain Traits of Selected
High School Seniors Interested in Teaching, Teachers
College, Columbia University, Contributions to Education,
No. 660, New York: Teachers College, Columbia University,
1935.














JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961) 


CHAPTER XI 


SOME ASSUMPTIONS, EXPLICITLY AND IMPLICITLY MADE, 
IN THE INVESTIGATIONS HERE SUMMARIZED 


D. A. WORCESTER 


In all studies of teaching effectiveness, certain
assumptions have to be made, although not all of them have
been explicitly stated. One way to examine the fruitfulness
of the studies here summarized is to note these assumptions
and then try to determine the extent to which they have been
verified. Some of the assumptions are in terms of general
characteristics or traits. Some are quite specific. Many of
the apparently general factors involved are in reality
composites, possibly very complex composites. These will
need to be examined in greater detail. It is customary, for
example, to speak of the efficient teacher, and we have many
studies of teaching efficiency. What are some of the
assumptions found in these investigations?


General Teaching Ability 





One hypothesis is that there is a general teaching ability.
It is frequently implied that some persons are just born
with it. Believing this, some feel, as William James did
about instincts, that it is not profitable to investigate
further. All we need to do is to watch behavior to know
whether or not an individual is effective as a teacher.
Others who, perhaps, also accept the idea of a general
ability to teach, think that we will identify the good
teacher not by watching her behavior but by watching the
pupils. The evidence of effective teaching is the change
produced in pupils. In either case, there are various
assumptions and implications involved.

For example, in two studies (45, 66), the best teacher was
defined as the one whose pupils made the greatest gains in
knowledge of certain units of a particular subject, and in
attitudes, interests and appreciation in the general area of
that subject, taken singly or as composites. These were
beginning teachers, graduates of the same institution, in
self-contained classrooms, in similar communities, who had
volunteered to aid in the investigation. The pupils were in
the same grade and of similar socio-economic background.
Appropriate attention was given to mental ability, age, sex
and previous educational achievement. Now what implications
are present?

One is that there is such a thing as a good teacher. It
should be noted that the investigators disclaim any attempt
to generalize beyond their present studies, but there is the
almost inevitable implication that the one whose pupils make
significant gains is a “good teacher.” Otherwise, why
measure such effectiveness? No one is going to employ a
teacher just to teach two three-week units during the year
to a single class.

There are two real values in this type of research. If one
can discover how to measure effectiveness in a limited
situation, he will be able, it is assumed, to apply the
technique quite generally. That at least academic gains can
be adequately measured seems to have been well demonstrated.
Secondly, if, after identifying the teacher whose pupils
make the greater gains, one can determine the
characteristics of that teacher or the particular elements
of her teaching behavior associated with pupil gains, it may
be found indeed that these are factors in all good teaching.
It may be possible to predict teaching efficiency in
general. Although much excellent work has been done, the
number of implications has been so great that a good deal
more research is needed before these desired predictions can
be made. What are some of these implications?


Effective in All Phases of the Subject





It is implied, for example, that the one who is 
effective in teaching the particular units covered 
in the study will be a good teacher in other units 
of the same subjects. This is a fairly logical in- 
ference, although subject to a degree of doubt if 
some units involve, let us say, demonstration. 


Effective in Other Conditions 





It is implied, also, that the teacher who is successful in
the conditions of this investigation would be successful in
teaching the same units under other conditions. This is a
doubtful inference. In some researches reported, teachers
knew ahead of time what unit was to be taught for the
experiment, what were the objectives sought, the length of
the unit, that the class was to be tested before and after
teaching, and so on. It is a plausible hypothesis that in
this institution the “best” teacher would turn out to be the
insecure teacher. Fearful of her ability, she may have put
forth unusual efforts to secure results, driving the class
on this unit and neglecting other subjects of the
curriculum. Possibly the more self-assured teacher,
confident that she was doing a good job, would get lesser
results on the test of this particular unit but might get
greater ones in this subject in the long run and without
neglecting other classes.


Effective in All Subjects 





It is implied that a teacher who is effective in one subject
is effective in others. Various conditions have existed
among the studies reported. At the elementary level, some
have measured progress of pupils in one subject in one of
four or more grades taught by one teacher. Some have looked
for the gain in a unit of one of several subjects taught in
a single grade. Investigations in the secondary school have
included teachers of a single subject (or at least a single
area, e.g., history or music) and those who have been
teaching in more than one content area. The implication is
that the teacher whose pupils make gains in, say, a social
study will also secure gains in arithmetic, reading or
geography. When comparison is made of teachers who are
effective in their separate subjects with those who are not
so effective, it is assumed that there is a common factor
operating. That is, of course, not a necessary conclusion.


Effective with All Kinds of Ability 





It is implied that the teacher is equally effective with
children of varying mental ability. The studies reviewed in
this monograph yield little information on this point.
Usually there have been attempts to control such factors as
intelligence, socio-economic background and previous
achievement rather than to see if different types of
teachers are especially effective with different types of
children. Some studies have revealed that children of low
initial achievement have gained more than those who started
at a higher level, but the explanation suggested has not
been in terms of the special suitability of the teacher for
pupils of that kind.


Effective with Either Sex 





There is the implication that male teachers are as effective
as females. Most of the elementary teachers who participated
in these studies were female. On the secondary level, where
both sexes have given aid, some attention has been given to
differences in characteristics between effective male and
female teachers, but there has been little, if any,
evaluation of sex differences in teaching effectiveness as
such. While some studies have shown that the pupils of one
sex have gained more than those of the other, no attempt has
been made to identify particular teachers who are successful
with boys or girls. Supervisors, however, not uncommonly
report that a certain teacher is successful, say, with
girls, but is overwhelmed by the disciplinary problems of
boys. We have here, as in most of the implications above,
what we may call sub-implications, that males, or females,
not only are as effective with girls as with boys, but that
they are equally effective at each grade level, in each
content subject, in the development of character, social and
emotional poise, and so on. In the United States during
modern times there have been relatively few men teachers in
the elementary grades. In many other countries, a large
proportion have been male. Whether or not there is a sex
difference in teaching effectiveness, and if so, whether it
is general, at certain levels, in certain subjects, with
certain types of children, is a matter for research. In this
connection, the use of masculinity-femininity tests might be
indicated.


Effective in Non-academic Areas 








An exceedingly important assumption is that the teacher who
is effective in an area or in areas of academic learning is
also effective in developing other educational objectives.
The Northwest Ordinance of 1787 stated that “Religion,
morality and knowledge being necessary to good government
and the happiness of mankind, schools and the means of
education shall forever be encouraged.” Our schools have
always accepted this commitment to teaching character and
citizenship. Schools are supposed to make good boys out of
bad boys. Some states have required by law that “character
education” be taught. Some researches have tested knowledge
of what one should do in certain civic situations and what
one’s attitudes should be towards civic problems. None have
attempted to determine pupil gains in morality (in goodness
as such), nor has there been evidence accumulated to show
increase in civic behavior. There are many who would accept
the implicit assumption that a teacher whose pupils show
gains in mathematics will also show growth in character. We
suspect that some of them would not accept the converse
assumption, that a teacher whose pupils have learned to be
good citizens would also show significant gains in
arithmetic.

A more modern statement of non-academic goals of education
is, perhaps, to say that the schools have a function in
life-adjustment, the development of personality, of social
and emotional control. The school should produce
well-adjusted, happy individuals. Where teacher behavior has
been the criterion of efficiency, the supposed effect of
particular acts of the teacher on the development of
character and personality has been emphasized. Because of
the lack of refined measures and because of the time
expected to achieve noticeable gains toward these
objectives, they have been largely neglected in studies of
pupil gain.

In the meantime, supervisors of practice teaching and school
administrators appear to operate on the assumption that an
individual who is effective in one of the school functions
is effective in all of them. This is seen in the finding
that the general over-all rating assigned is as significant
as the ratings on particular traits.

Summarizing to this point: there is a widespread assumption
that there is a general characteristic which may be called
teaching ability. A good teacher is effective in any school,
at any grade level, with pupils of either sex and of varying
ability, in all of the subjects in her area of assignment,
academic or non-academic. Or, to put it another way, and
perhaps more accurately, it is believed that some
characteristics are present in all good teachers, wherever
they are found. One such characteristic may be teaching
ability. In his factorial analysis, Hellfritzsch (34)
identified in one situation a fifth factor to which he gave
that name. He did not isolate it in other tests. Hampton
(33) could not find such a factor.

It should be mentioned again that the investiga- 
tors were careful not to generalize their findings 
in the ways stated above, but the implications re- 
main. While it seems doubtful that a general 
teaching ability exists, it would be unwise to make 
such an assertion at the moment. If we accept for 
the sake of argument that there is such a general 
factor of teaching ability, certain other questions 
immediately arise. Discussions of some of them 
will have pertinency even if there be no such gen- 
eral factor. Is it of genetic origin? When can it 
be identified? Is it subject to training? 


Teaching Ability of Genetic Origin 





As mentioned above, there are those who assume that teachers
are born. They have a native knack for teaching as others do
for mechanical manipulation, scientific inquiry, or musical
performance. Granted a favorable environment for its
development, teaching ability will appear. For the true
teacher, courses in how to teach are superfluous. Those
holding this view like to point to the studies which show
little relationship between grades in professional courses
and pupil gain. Leaving to later a discussion of some of the
possible reasons for the lack of higher relationships, it
may be noted here that in most of the studies reviewed in
this project, the teachers were from the same institution,
frequently the University of Wisconsin, that they had had
work in professional education, all were certified teachers,
in most instances they were volunteers, and in many
instances (all of those at the University of Wisconsin),
they were a selected group in terms of academic grades. The
researches do not reveal what results might have been
obtained in the classes of those who loved to teach (being
born to teach), but who had had no professional training at
all. Such a study would be very interesting and might
possibly be undertaken by utilizing staffs of private
schools in which certification is not required. Difficult,
though not necessarily insurmountable, problems of control
would be encountered.


Early Identification 





If teaching ability be inherent, and even if it is not, but
is associated with specific characteristics, it might be
expected to express itself early. Some have stated that they
“always wanted to teach” (9). Some children have been
observed to play teacher even before they went to school.
Students in teacher training courses at the University of
Wisconsin for several years have been asked to write
autobiographies. In these, one might anticipate finding
statements of early interests in teaching. Also, a mass of
information about the students has been collected on student
data sheets and the like. These have included such questions
as “Why did you choose teaching as a profession?” and “What
were your reasons for entering the School of Education?”
Usually they were not asked “When did you first think that
you wanted to teach?” Analyses of these instruments (82, 64)
have not shown when individuals first showed interest in or
aptitude for teaching.

Ringness found that interest in the subject field in which
high school teachers later taught appeared quite early, in
the grades, but the decision to teach as revealed by the
autobiographies did not come until the end of high school or
even until after entering college. That these are true dates
may be questioned, however, as the information was at no
time directly requested. In these days, evidence accumulates
that not only interest in such activities as music, art, and
dancing, but also interest in science, mathematics, writing
and the like frequently, if not commonly, may be identified
at very early ages. It is by no means improbable that the
potentially effective teacher may also be detected among
very young individuals when we learn what to look for, and
we should certainly expect to do so if teaching be a native
ability.





Prediction Prior to Professional Training 





The elaborate data collected through the autobiographies and
on the data booklets is evidence of the assumption that the
effectiveness of the prospective teacher can, to some degree
at least, be predicted before entry into the professional
teacher training program. However, Von Haden (82) examined
several items from the booklets and the information from the
autobiographies and found none which gave significant
relationship to teacher effectiveness as measured by pupil
gains or pupil evaluations. Rarely has the candidate for
teacher training been asked to present evidence of interest
or aptitude for teaching in the application for admission.
Marks attained in courses are the usual criteria.

Frequently, the high school principal is requested to make
an estimate of the probability of the student’s success in
college. He is not asked to forecast effectiveness in
teaching. Lins (49) divided a group of teachers into those
who, at the time of applying for admission to the
university, indicated teaching as their first interest and
those who listed it as the second or third interest. He
reported no differences between these groups as to
effectiveness. Leila Stevens (74), using a projective
technique, investigated the attitudes of high school
students toward teaching as a career. She divided the group
into those favorably and those unfavorably inclined toward
teaching. However, she presents no evidence as to the number
from each group who later entered the teaching profession
and therefore, of course, there is no knowledge as to their
relative effectiveness. Alfred Johnson also tried to find if
there were personality differences between those who said
they were going to teach and those who did not plan to do
so, but does not show which of these actually did go into
this occupation. It would be valuable if the individuals who
participated in these studies could now be located and their
success compared with the measures obtained earlier.


Consistency of Effectiveness 





Clearly, the purpose of measuring teacher efficiency is not
so much to find out how well she is doing right now,
important as that is, as it is to predict how well she will
continue to do. The general assumption is that teaching
ability remains fairly constant. Only a few studies have
been made of this, and none over any considerable time and
using pupil gain as a criterion. One, Brookover (12), showed
teachers who were over 38 years of age secured lesser gains
than did younger teachers. But we know nothing of the gains
these teachers were securing when they were, say, 25 years
old.





Briggs (14) studied some teachers ten years after graduation
and concluded that none of the earlier ratings correlated
with the ratings of principals ten years later so as to have
a predictive value greater than chance. Some older teachers
whose pupils had lesser gains were given higher general
ratings by both superintendents and pupils. Possibly had
there been measures of pupil adjustment, the older ones
would have shown the greater gains.
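
The kind of comparison Briggs reports, earlier ratings set
against ratings ten years later, rests on a correlation
coefficient. The sketch below computes a Pearson r for two
sets of ratings. The figures are invented for illustration,
the function name pearson_r is hypothetical, and the source
does not say which coefficient was actually used.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings of five teachers on a 1-to-5 scale:
# at graduation, and by principals ten years later.
early_ratings = [1, 2, 3, 4, 5]
later_ratings = [3, 5, 1, 4, 2]

r = pearson_r(early_ratings, later_ratings)
print(round(r, 3))  # -0.3 for these invented figures
```

With so few cases, a coefficient of this size could easily
arise by chance, which is the sense in which earlier ratings
may fail to have predictive value.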


Conditions of Development 





Another question arising from the assumption that there is a
general ability for teaching is what are the conditions for
its development. Talent does not often “bloom” without
certain favorable conditions but, it is commonly held, no
amount of favorable nurture can produce talent. The best of
instruction and unending hours of practice will not produce
a Bernstein from one having no musical talent. With talent,
training may be effective, although with high talent, the
amount of training necessary may be minimal. (There are
instances of persons who have been very effective in dealing
with the ill who had no medical training.) Two possible
contrasting implications follow. One is that, if one is a
born teacher, all that he needs is to know the subject he is
teaching. The other is that professional training will be
valuable to those who possess teaching ability, though it
can do little for those who do not.

We may have here a key to the inconsistencies between
professional training, supervisor’s ratings and pupil gains.
Schools of education, mainly, perhaps, because they have no
way of estimating native teaching ability, have admitted to
professional training almost anyone of good moral character,
of reasonably high intelligence and satisfactory achievement
grades, either in their high school or earlier college
courses. The implication is that if one is a good student
and/or has a fair level of mental ability, he can be taught
to be a good teacher.

It is true that various studies have shown that grades in
courses and mental ability are associated with pupil gain,
but some studies have not found this relationship, and
usually when it is found, it is not very high, not high
enough to justify employing teachers on these criteria
alone. Some other factor or factors are evidently operating.
Professional training may teach individuals how to act like
teachers without necessarily giving them the ability to be
teachers, as a medical college may inculcate some of the
traditional mannerisms of the physician without producing an
expert diagnostician or surgeon. So it may be easier to
understand the disagreement between ratings and gains. The
supervisor may be judging the way the violinist holds his
bow rather than the tone produced. Supervising personnel
and, indeed, the general public have built up a stereotype
of a good teacher. With a little practice, this stereotype
can be acquired, on the one hand, and recognized and named,
on the other hand, thus securing some agreement between
teacher behavior and ratings. There remains to be
determined, however, the agreement between these ratings and
pupil change.


Competency of Raters 





The discussion immediately above leads to consideration of
another assumption that has had a major influence in most
studies of the effectiveness of teaching, viz., that
supervisors, superintendents, principals and board members
are competent judges of good teaching. And here again there
is the implication that this is a general ability requiring
little or no training. It is just as natural for an observer
to know a good teacher when he sees one as it is for the one
observed to be one. Administrative officials recommend
teachers, hire them, promote them, discharge them. But in
the training of these persons is found very little attention
to how to judge effectiveness.

One well-known professor of school administration has told
this writer that in his classes, the question is not
considered at all. In none of the studies included in this
report in which ratings have been employed has there been
evidence as to the competency of those doing the rating. The
professional training, the intelligence, the grade point
average, the personality patterns of those rated have been
studied extensively; the same questions have not been asked
of those doing the rating. These raters are likely to be
graduate students, professors of education, members of state
departments, superintendents, supervisors of practice
teaching, principals, etc. Some of them have been heads of
schools in small communities and some have not actually
worked in a classroom for a long time if ever. Many
principals give little time to systematically observing
teaching procedures, examining testing methods, and
otherwise really evaluating the results of teachers’
activities. The judgments of raters are frequently based
upon personal attractiveness, willingness of the teacher to
participate in extracurricular activities, her presence in
town over the week-ends, the frequency with which pupils are
sent to the office for discipline, but not on evidence
supported by tests of pupil progress.

In almost all of the studies, ithas been assum- 
ed that a person with a certain title is a compe- 
tent judge. It is amazing that this assumption has 
not been more seriously challenged. Perhaps the 
reason is that in our school systems these per- 


sons are the ones who have the power to make de- 
cisions. The implication follows that the real 
working criterion of teaching success is the abil- 
ity to secure and to hold a position. Having said 
all of this, and in spite of the lack of evidence for 
a general trait which can be called teaching abil- 
ity, supervisors and placement officers seem to 
be confident that they can tell a good teacher when
they see one. 


General Intelligence 





Granted for the moment, again, that the existence
of a general factor which can be called
teaching ability is at least a matter of question, 
are there other possible general characteristics 
which, if not peculiar to, are necessary for, ef- 
fective teaching? One such factor may be intel- 
ligence. Several studies, many of them correla- 
tional, have indicated a positive relationship be- 
tween teaching effectiveness and intelligence, al- 
though not all have found this to be true and where 
correlations have been found, they are often low. 
Rostker’s (66) results indicated intelligence to be 
an important factor. Hampton (33) found signifi- 
cant relationships in certain situations but not in 
others. Hellfritzsch (34) discovered with eighth 
grade teachers a factor of “general knowledge and
mental ability.”


Effectiveness Varies Linearly with Intelligence 








The common assumption has been that intelli- 
gence varies directly with effectiveness and it is 
implied that it is equally influential at all grade 
levels, in all subjects and with all types of pupils.
None of these implications has been verified. As
has been noted above, the teachers of these inves- 
tigations were quite highly selected. While they 
represented rather well a cross-section of teach- 
ers certified through the University of Wisconsin, 
or in a few instances, as in Hampton’s, some
other institution, they were definitely above the 
average, mentally, of the general population. It 
is not improbable that they were somewhat above 
the average of teachers in general. Whether or 
not these teachers were doing a better job in their
classes than those trained in other institutions is 
not known. 

It may be that there is a straight-line relationship
between teaching effectiveness and intelligence.
It is also conceivable that there is a minimum
level of intelligence necessary for success in
teaching under any conditions and that beyond
this minimum, various possibilities exist. A
further increase in intelligence may not affect
teaching effectiveness at all, or it may affect it
in decreasing or increasing degrees. In some
occupations, for example, a certain visual acuity
is required but a greater acuity is not associated
with better performance. It is possible that the
amount of intelligence required for good teaching
depends upon circumstances.
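
The contrast between a straight-line relation and a
"minimum level" relation can be illustrated with a
small sketch. Everything in it is an assumption made
for illustration: the scores, the cutoff of 100, and
the function names are invented and are not drawn
from any of the studies reviewed here. The sketch
supposes that effectiveness rises with test score
only up to the cutoff, and then compares the
correlation over the whole range with the
correlation inside a group selected to lie at or
above the cutoff.

```python
# Hypothetical illustration only: all numbers are invented.

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # With no variation the coefficient is undefined;
        # 0.0 is returned here purely as a convention.
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def threshold_effect(score, cutoff=100, slope=1.0):
    """Effectiveness rises with score up to `cutoff`, then stays flat."""
    return slope * min(score, cutoff)


scores = list(range(80, 141, 5))              # 80, 85, ..., 140
effect = [threshold_effect(s) for s in scores]

# Correlation over the full range of scores.
r_all = pearson(scores, effect)

# Correlation inside a group selected at or above the cutoff.
selected = [(s, e) for s, e in zip(scores, effect) if s >= 100]
r_selected = pearson([s for s, _ in selected],
                     [e for _, e in selected])

print(f"r over the full range:       {r_all:.2f}")
print(f"r within the selected group: {r_selected:.2f}")
```

Under these assumed numbers the trait is necessary
for success, yet it shows no association with
effectiveness inside the selected group. A low
correlation in a highly selected sample is therefore
compatible with a minimum-level relation and does
not by itself show that the trait is unimportant.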

The assumption that intelligence is directly re- 
lated to teaching efficiency implies that the same 
amount of intelligence is required to be success- 
ful in the first grade as in the seventh grade or 
in the twelfth grade. Is it? Apparently the level 
of intelligence necessary to learn the material of 
the third grade is less than that needed to learn 
the subject matter of the eighth. The intelligence 
necessary to successful graduation from high 
school is greater than that one must have to finish 
the eighth grade. Are there similar differences 
in the intelligence needed to teach the subject 
matter of these grades? Is the mental ability ne- 
cessary for effective teaching of arithmetic the 
same as that for teaching spelling or civics? Or, 
may there be differences in the mental equipment 
for the teaching of advanced algebra and the
teaching of manual training or music? And is the
successful teacher of pupils with IQs of 70
necessarily of the same intelligence as the one
whose pupils are of IQ 140 and above?

A reply to these questions is that the teacher 
must not only know the subject matter to be taught 
but she must be able to understand the children of 
her classes and that the understanding of children 
of whatever type and in whatever class they may 
be demands similar degrees of intelligence. But 
do we know that this is true? A plausible
hypothesis is that children of younger ages and of
lower abilities are less complex organisms than
those who are older or have higher ability. The
teacher of the older should understand not only
these children at the moment but understand how
they got that way. That is, she should be
understanding of both younger and older children.
Therefore, perhaps differences in mental ability
required of different teachers may be expected
and desirable. It has been charged in the past
that teachers prepared in normal schools and
teachers colleges for the elementary grades have 
been of less mental ability than those prepared in 
universities for the high school grades. Perhaps 
that is the way it should be. 

The above hypothesis may not be a popular one
among certain educational groups, and studies 
concerning it may encounter some resistance, but 
it would be well to explore it. One investigator,
Rolfe (65), has already indicated that intelligence
is more highly correlated with effectiveness in 
eighth grade classes when the teacher teaches on- 
ly that grade than in those in which the teacher 
teaches all of the grades. When approaching 
such an investigation, it should, of course, be
remembered that the level of intelligence has no
essential relationship to importance. It may take a
higher score on a conventional test of mental
ability to secure an A in philosophy than to earn an
A in dietetics— but a poor diet may be a major in- 
fluence on the kind of philosophy produced by the 
savant. 


Kinds of Intelligence 








There is also the question of kinds of
intelligence. The common use in these studies of the
Henmon-Nelson or the ACE test implies a theory
of general intelligence. A logical thesis is that a
test of primary abilities might reveal an
association of verbal ability with success as a
social science teacher, while a numerical factor
might have greater relationship with effectiveness
in mathematics, and a spatial factor in the teaching
of manual arts. If there be such a thing as social
intelligence, it may be of special importance
either in the development of pupil adjustment or at
least in the securing and holding of a job. The
low relationships and the inconsistencies found
between intelligence test scores and pupil gains
may be due to differences in the amounts or kinds
of intelligence requisite for effectiveness in the
various subjects. Ronald Jones (42), for example,
sought to determine the correlations between
intelligence and pupil gains in 14 high school
subjects, but the design was not such as to make the
comparison subject by subject. Evidence upon
this subject might help determine not only the
placing of teachers but also such questions as the
departmentalizing of instruction, homogeneous
grouping and team instruction.
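
The value of a subject-by-subject comparison can be
sketched with invented figures. In the sketch below
the records, the subject labels, and the names
`records` and `by_subject` are all hypothetical; no
data from Jones or any other study are used. It is
assumed that a numerical-ability score is closely
related to pupil gain for mathematics teachers and
unrelated to it for social science teachers, and the
pooled correlation is then set beside the two
within-subject correlations.

```python
# Hypothetical illustration only: all records are invented.

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# (subject taught, teacher's numerical score, mean pupil gain)
records = [
    ("mathematics", 40, 2), ("mathematics", 50, 4),
    ("mathematics", 60, 6), ("mathematics", 70, 8),
    ("social science", 40, 4), ("social science", 50, 6),
    ("social science", 60, 6), ("social science", 70, 4),
]

# Correlation with all teachers pooled, ignoring subject.
pooled = pearson([score for _, score, _ in records],
                 [gain for _, _, gain in records])

# Correlation computed separately for each subject.
groups = {}
for subject, score, gain in records:
    groups.setdefault(subject, []).append((score, gain))

by_subject = {
    subject: pearson([s for s, _ in pairs], [g for _, g in pairs])
    for subject, pairs in groups.items()
}

print(f"pooled: {pooled:.2f}")
for subject, r in by_subject.items():
    print(f"{subject}: {r:.2f}")
```

With these assumed records the pooled coefficient is
moderate and describes neither group: the relation
is perfect in one subject and absent in the other. A
design that reports only the pooled figure cannot
distinguish this case from a uniform moderate
relation in every subject.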


General Knowledge 





An assumption commonly held is that general 
knowledge, and especially knowledge in one’s own 
field, is related to teaching effectiveness. Hamp- 
ton (33) found a general knowledge-mental ability 
factor in successful high school teachers. Ronald 
Jones (42) found no significant relationships be- 
tween effectiveness and university grades or rat- 
ings. Inconsistent results among studies are the 
rule. Even marks in one’s major subject have 
not always proved to be good predictive criteria. 
It has been implied that one who knows his subject 
in general, or one who has a wide general
knowledge, is competent in the specific area being
measured. The “it stands to reason” argument
supports this implication. If a person knows
calculus, he must understand long division. If he
has high grades in an English major, he should 
be able to teach spelling and grammar as well as 
short story writing or the appreciation of poetry. 
If the major is in social studies, the teacher 
should be competent in history of any country, ge- 
ography, or a unit on public health. 







The relationship between grade point averages
and successful teaching has usually been found to
be positive, but too low for individual prediction.
However, a grade point average is a composite of
scores in a variety of courses and even a record
of straight A’s is not a guarantee of excellence in
every phase of a subject. Much of the material
studied in a student’s major is very different
from the units taught in elementary or high school
classes. It is not unlikely that the finding of
Ronald Jones and others, that high school averages
were more closely related to teaching effectiveness
in high school subjects than were college
averages, arises because these high school teachers
were teaching the things which they learned in high
school.


Linearity 


Just as it is conceivable that beyond a certain 
point, an increase of intelligence is not
accompanied by an increase in effectiveness, so it is
possible that an increase in knowledge beyond a
certain amount is of little significance. There 
are those who believe, without, so far as we 
know, any supporting or negating evidence, that 
an individual may know too much to be a good 
teacher of children at certain levels. Not only is 
a college major in mathematics not necessary to 
do an effective job of teaching arithmetic at the 
third grade level, is the argument, but it may ac- 
tually get in the way of good teaching. Or, it may 
be that the teacher who has very extended
knowledge of a field will find it difficult, with
those at the beginning levels, to limit himself
patiently to the process through which the learners
at this point are going. The argument will not appear
reasonable to some, but perhaps it should not be 
rejected without consideration. 


Professional Training 





Training Required 





A very common assumption among profession- 
al educators has been that professional training 
is needed in order for one to become an effective 
teacher. It has been rather generally agreed that 
this training should include knowledge of the field 
to be taught, knowledge of the learning process,
an understanding of children and practice teach- 
ing. Other areas, such as the history and phil- 
osophy of education, and the knowledge of how to 
construct and use devices to measure the results 
of instruction are also frequently required. Just 
as it has been implied that if one knows his sub- 
ject, he can teach it, so it is implied that if one 
knows about children, he can teach them; that if 
he knows about the psychology of learning, he can
choose the best ways of teaching.


Ability to Evaluate Progress Implied by Ability 
to Teach 





Especially it has been implied that if one can 
teach, he can measure the results of teaching. 
This latter implication is of very doubtful validi- 
ty, and it is surprising that although teachers as- 
sign grades, recommend promotions and label 
pupils as high or low in scholarship generally, 
little training in testing has been included in many 
teacher training programs, and rating scales and 
schedules for observing teacher behavior contain 
almost no recognition of this function. Since,
too, testing is not only a measure of progress
but an integral part of the learning process,
this is a serious oversight.


Professional Courses 





Studies have been disappointing in that no close 
relationships are usually found between grades in 
particular professional courses or between com- 
posite grades in these courses and teaching effec- 
tiveness. Correlations have been higher between 
grades in practice teaching and ratings of super- 
visors or superintendents than have been the cor- 
relations between professional courses in educa- 
tion and these ratings. This may be because su- 
pervisors and superintendents were looking more 
at the personal characteristics of teachers than 
at pupil progress, or it may be that the concepts 
taught in the professional courses have not been 
transferring directly to classroom practices. 
There is another possibility. Lange (47) sought 
to find the relationships between knowledge of 
specific concepts taught in certain professional 
courses and pupil gain. 


Similar Works and Courses Imply Same Concepts 





There is an assumption that those who have 
had professional courses of the same title have 
had the same training in that area in whatever 
state or institution these courses are given. This 
assumption is seen in the certification require- 
ments of the various states, many of which re- 
quire courses by title. In Lange’s study (47), all 
of the teachers had taken their professional work 
at the University of Wisconsin, so it might be ex- 
pected that all who had had the same courses had 
acquired the same concepts. Several sections of 
the basic course investigated by Lange are given 
at the University of Wisconsin and are taught by 
different instructors. It was found that in many 
instances the instructors did not agree. It could 
hardly be expected that there would be a
significant relationship between concepts presented
in a course and teacher effectiveness if the concepts
were given one meaning in one section and anoth- 
er meaning in another section. If all of the 
teachers had developed their understandings un- 
der the same instructor, a relationship might 
have been found—though whether the correlation 
would be positive or negative might depend upon 
which instructor it was. Barr has pointed out 
that many terms are used to denote the same 
characteristics or behaviors. It is here
emphasized that the same words may convey many
meanings. An investigation conducted among 
teachers from a smaller institution where all did 
have their professional work under a single in- 
structor might be of interest. 


Teacher Behaviors 





Those who direct professional training of 
teachers and those who build rating scales are 
likely to assume that there are certain charac- 
teristics of teacher behavior which result in the 
effective learning of pupils. This is based on the
prior assumption that there is a specific way in 
which learning goes on and that she whose be- 
havior is most suited to this learning process is, 
of course, the most effective teacher. It follows,
too, that there is a basic method (learning by
rote or the method of discovery, for example)
for all areas: the acquisition of academic
knowledge, manual skills, the ability to solve
problems or the development of citizenship. It is
recognized, of course, that there may be
variations in the application of the method.
Subtraction by either the additive or borrowing
method can be taught by rote. On the above assumption,
schemes of observing teacher behavior have been 
employed. Notes may be made of the amount of 
talking done by the pupils and by the teacher, the 
number and types of questions asked, the com- 
ments upon pupils’ responses, the number and 
kinds of aids which are provided, the techniques 
employed to maintain order in the classroom. 
Where there is agreement among those observing 
as to the interpretation to be given to the specific
elements of teacher behavior and where there has 
been practice in making and classifying
observations, quite consistent ratings of teachers
have frequently, but not always, been obtained.
Obviously, the value of the technique of observing
teacher behavior is dependent upon the evidence 
that this particular bit of behavior will actually 
have this effect upon the pupil and that this effect 
will result in the desired learning. Evidence on 
these points is scanty and agreement among ob- 
servers of teachers may mean only mutual assent 
to a particular philosophy or psychology of
education. Are we sure that the same type of
teacher behavior is equally effective in different
learning situations and are we sure which theory
of learning is the better to follow at the moment?
If a teacher who was nurtured in the Thorndikian 
S-R hypothesis is observed by an enthusiastic 
follower of field theory, how will she be rated? 
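
The distinction between agreement among observers
and evidence of effect upon pupils can be sketched
numerically. The ratings and gains below are
invented for illustration, and the names `rater_a`,
`rater_b`, and `pupil_gain` are hypothetical. Two
observers trained in the same scheme are assumed to
rate six teachers, and the agreement between them is
compared with the relation of their averaged rating
to pupil gain.

```python
# Hypothetical illustration only: all ratings and gains are invented.

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rater_a = [1, 2, 3, 4, 5, 6]      # first observer's ratings of six teachers
rater_b = [2, 2, 3, 5, 5, 6]      # second observer, trained in the same scheme
pupil_gain = [5, 3, 6, 3, 6, 4]   # measured gain of each teacher's pupils

# Agreement: how closely the two observers' ratings correspond.
agreement = pearson(rater_a, rater_b)

# Validity: how closely the averaged rating follows pupil gain.
mean_rating = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]
validity = pearson(mean_rating, pupil_gain)

print(f"agreement between raters: {agreement:.2f}")
print(f"rating versus pupil gain: {validity:.2f}")
```

In this assumed case the observers agree closely
while their ratings bear essentially no relation to
pupil gain. High consistency among raters shows only
that they share a scheme of interpretation; whether
the behaviors rated produce the desired learning
requires separate evidence.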


Rating Scales 


It is surprising how little attention has been
given on teacher rating scales and on observation
records to actual methods of teaching and the
direction of learning. Rather, the observer usually
notes personality characteristics: if the teacher
is alert, if she appears enthusiastic and
cooperative, if she seems to be interested in the
children, if she has good discipline, if she is
forceful and the like. Ratings may estimate the
teacher’s knowledge of the subject, her
professional attitudes, whether she has provided
supplementary materials and perhaps whether she
demands good performance from the pupils.
Sometimes it is noted if she makes adjustments
to individual differences, but rarely is there a
record of what specific adjustments are made.
There is, to be sure, no way by which the observer,
on no more than one or two occasions, can
measure the adequacy of adjustment to the
individual child. In order to make a rating, he would
have to know the characteristics of the pupil, his
previous performances, his present stage in the
learning process, and so on. In the meantime,
the observer gives a plus value on the chart to
adjusting for individual differences and a minus
value for giving more attention to one child than
to another, though both marks may relate to the
same teacher behavior.

There are few ratings found on those behaviors
which relate to teaching and learning as such.
What did the teacher do to develop a specific
concept? How did the pupils respond to this
presentation? What was the next step? How and when
were errors detected, and what was done to elim- 
inate them? The assumption has seemed to be 
that if the teacher has a friendly personality and 
respects the personality of the pupil and that if 
she is active, enthusiastic and in good standing 
with the others of the school personnel and in the 
community, then she is an effective teacher. 


Training in Making Observations 





It has been noted that those who observe and
rate teacher behavior can be trained so as to
secure a high degree of consistency and agreement.
This is to be expected when they have agreed
beforehand to judge a teacher who sits at her desk
a good deal as passive, and one who moves
around a lot as active, one who gives much help
to a child as partial, and the one who asks the
same number of questions of each child as
impartial, and so on. To judge whether or not a
teacher is helping a child according to his need is
another question and requires more information for
its answer than is usually obtained by the
observer.


Jayne (37) did try to get at some of the items
of teacher behavior which at least possibly might
be rather directly related to the learning of pupils.
He found no relationship between the amount the
teacher talked and pupil gain. He did not analyze
the talk to identify the learning method which
might be involved. He further investigated gain
and retention in terms of amount and spacing of
teacher participation, the number and kinds of
questions asked, whether fact or thought
questions. The relationships between kinds of
questions and gain seemed to vary with circumstances
as did the value of critical comments by the
teacher. There was speculation that the kind of
questions appropriate to one purpose might not be
suitable to another, which again suggests that there
is no one method of teaching or learning
applicable alike to all areas and levels of
achievement. Jayne found no specific teacher activity
significantly associated with pupil gain.


Personality of Teachers 





It has been mentioned earlier that rating
scales and systematic observations of teacher
behavior have given much attention to personality
traits. This is no doubt a result of the emphasis
in modern schools upon “the whole child.”
To test the assumption that there are certain
aspects of personality associated with effective
teaching, instruments such as temperament
scales, personality scales, measures of neurotic
behavior and the like have been used. Studies of
individual traits and composites of traits have
been reported (46, 57, 41). Some of these have
been subjected to factorial analysis. These traits
of teacher personality have been related to
supervisors’ ratings or to pupil gains in academic
subjects or in what might be called academic
attitudes. That is, personality factors of teachers
have been compared with pupil gain in, for
example, knowledge about public health and with their
attitudes toward public health or with their civic
beliefs. There has been little attempt to relate
the teacher’s personality to changes in the
personality of the pupils.

Once more the implication is that these
personal characteristics of the teacher which are
favorable to development in academic knowledge
are also favorable to growth in personal
adjustment, or vice versa. And there is also the
implication that the same pattern of traits is
appropriate for all pupils at all levels of learning.
Another paper of this report tells of the attempt
to discover some characteristic or composite of
characteristics which is present in all good
teachers and absent in poor ones, or some factor
present in poor teachers but absent in good ones.
The results seem to be inconclusive. One study
found that considerateness, although highly rated
by supervisors and pupils, was negatively
correlated (though not significantly so) with pupil
gain. Indeed, no specific trait or composite of
traits has been found which is inevitably associated
with teaching effectiveness. These studies have not
attempted to find if there be patterns of
personality traits of teachers especially effective
with pupils who, too, present particular patterns of
personality. It is a perfectly good hypothesis that
the permissive teacher will get good results with
one type of pupil, while the rigid disciplinarian
will succeed much better with another type. A
very forceful, active teacher may be well adapted
to a class of very bright children and not so well
adapted to dull children. The lack of studies
concerning the relation of teacher traits to
personality development of children is probably due
to the lack of good instruments for measuring such
growth and to the fact that changes in personality
are likely to come slowly and cannot be adequately
evaluated in the time available for the
investigations.

Studies of teacher personality and observation
of the behavior of successful and unsuccessful
teachers are, of course, of real value and
substantial progress in the understanding of factors
in effectiveness has resulted from them.
Ultimately, they must be related to change in pupils
in some aspect of learning for which the school
has accepted responsibility.

Teachers deal with groups of children, groups 
which are heterogeneous along several dimen- 
sions. If some characteristics or some methods 
seem to be associated with good results with a 
considerable proportion of the group, and many of
the studies reviewed in this monograph have de- 
vised ingenious and rewarding techniques for the 
discovery of these, then the superior teacher will 
be the one who possesses enough of these to serve 
the larger number of pupils effectively. 

As the numbers of children warrant it, classes 
may be divided according to the characteristics of
pupils, and teachers assigned whose own charac- 
teristics are appropriate for these children. At 
the moment, it may be desirable to employ teach- 
ers with different patterns of personality, differ- 
ent degrees of intelligence and so on for different 
grades and different fields of subject matter on 
the chance that each child will encounter some 
teacher whose specifications will fit his. In the
meantime the search will go on for the discovery
of factors of general significance.


Special Characteristics 





Speech 


At one time or another, there have been as- 
sumptions concerning the importance to teacher 
effectiveness of certain special characteristics. 
One of these is speech. It is so commonly be- 
lieved that teachers must possess good speech 
that many schools of education require
prospective teachers to take a speech test and to
receive special training in speech if certain defects
appear. It is believed that both good and bad
speech qualities are imitated, and that the teacher
should therefore present a good model; that
certain qualities of voice are irritating and so
interfere with learning; and that, of course, the
speech must be intelligible.

McCoard (55) did find significant relationships 
between general effectiveness of speech, but not 
between any one element of speech, and pupil 
gain, although the correlation was not high
enough to justify its use as an index of teaching
ability. On the other hand, Jayne’s study yielded
a correlation of -.09 between speech and pupil
gain. Neither of the investigations attempted to 
establish a cut-off point in speech proficiency be- 
low which the teacher would surely be ineffective. 
Perhaps here again we have a factor important to 
a certain degree but unrelated to success beyond 
that point. 


Discipline 


The ability to maintain discipline stands high 
among the desirable traits of a teacher as rated 
by superintendents and principals. The lack of 
this ability appears to be the cause of a large
proportion of teacher failures. It is assumed
that learning is more efficient in a well ordered
classroom. Discipline is also assumed to be one
of the means by which character is developed.
It is implied that if the
class of the teacher being observed is at the time 
running smoothly, or that if the teacher sends 
few children to the office, there is good disci- 
pline. It is inferred further that the teacher who 
has a well-managed class in one situation will be 
able to function similarly in all situations. Yet 
there are reports of the teacher who can get along 
well with girls but who cannot handle boys and of 
the one who can do well with a fourth-grade class
but is overwhelmed by seventh graders. 

Styles concerning discipline change. At one
time the rigid taskmaster who ruled with a
ferrule, whose pupils, be there only three of them,
must “rise, turn, and pass” at dismissal time,
was considered to be a good teacher. In the past
few years, there has been a tendency to approve 
the permissive teacher. The noisy classroom is 
not condemned but commended if it be “the noise
of industry.” Presently, there may be something
of a return toward stricter behavior requirements. 
One of the very likely reasons for disagreements 
between raters and between ratings and pupil 
gains is the difference in attitudes toward
classroom management. If the superintendent is a
disciplinarian and the classroom teacher has been
trained in a philosophy of permissiveness, the 
teacher is likely to get a low rating, not only on 
discipline, but because of the halo effect, on her 
general effectiveness. 


Other Special Considerations 


Conformity to Local Standards 





Various other special considerations are
assumed to affect the teacher’s success. One
teacher is rated negatively (14) because she went home
nearly every week-end. A teacher is not a good
one unless she participates in community affairs,
but of course in only certain affairs. What
effect these things have on pupil gains or on pupil
adjustment is uncertain. A rural individual (28,
14) is taking a heavy chance if she undertakes to
teach in an urban community, which perhaps
accounts for so much ineffective college teaching.
A mature person may not succeed well in an
“immature” community. Even personal attractiveness
may be a hazard if the ideal is to make
teaching a permanent occupation: “It appears
that the more attractive teachers left the
profession sooner than the less attractive ones.” (14)
While all of the above assumptions are undoubtedly
based upon some observations and known
practices, the objective evidence to support them is
not abundant. And sometimes contrary
assumptions are met. Not concerned that a rural girl
may not fit into a city school, many
superintendents not only ask that their candidates have
experience, but are perfectly willing to accept those
who have had that experience in rural schools,
on the assumption, apparently, that if one is a
good teacher in one situation, she will be good in
any situation.

Similarly, in one community, “father-being-a-farmer”
had a positive correlation with ratings
as a teacher, while in another community, it was
not a significant factor. To have been engaged in
athletic or musical activities seems to be an
important consideration in the ratings secured in
some places. In others, these factors have no
relationship whatever. Perhaps the assumption
can be justified that to be effective, the teacher’s
characteristics need to match the requirements
of a particular superintendent or a particular
community. Special factors, such as those
discussed here, are probably of less moment in
larger communities than in smaller ones, but they
are not absent. One superintendent of a quite
large city school is known to have stated that tall,
slender women are better teachers than short,
chubby ones. Although evidence of the effect of
such matters on pupil gains is lacking, it is still
good policy to try to place a teacher in terms of,
among other things, the information at hand
concerning the local school and community. A state
officer who had rated a teacher low, although the
superintendent had rated her much higher, said
that if the superintendent himself had been more
competent, he too would have given a low rating.
It is possible, however, that a teacher who would
have won the commendation of the state officer might
have been out of place with that particular
superintendent, suggesting further that teaching
effectiveness is a matter of many interrelated factors.


Health 


It is assumed that physical health and freedom 
from physical defects are associated with teach- 
ing effectiveness. Examinations whose purposes 
are to protect pupils from infections lie outside 
of the present discussion. But beyond these, it 
is difficult for a person who has a noticeable de- 
fect, other than that evidenced by not too thick 
lenses, to secure a teaching position. Teaching 
forcefulness and apparent supply of physical en- 
ergy are often judged to be associated. There is 
little evidence on this matter or on the degree of 
physical defect which can be tolerated before a 
reduction of effectiveness appears. 

Similarly, the degree of mental health neces- 
sary for efficiency, either generally or in par- 
ticular situations, has not been established. 


Use of Aids 


It is assumed that a teacher will make use of 
various available materials and aids to
instruction, and it is often assumed further that if
materials are not supplied, somehow she will
provide them. Or it may be assumed that a good
teacher will not need these supplementary sup- 
plies. The little evidence that there is, however, 
is that those schools which have the more abun- 
dant facilities seem to be producing the better 
results. The studies here reported have not
indicated what instructional aids were at hand for 
the teachers to use. 


Philosophy 


Differences in philosophy present a serious
danger in estimating the efficiency of teachers
through rating scales and especially through the
records of observed behavior. The scales or
record blanks were devised in terms of a particular
philosophy. If the observers are trained so as to
achieve reliability, it means that they have agreed
that a certain behavior indicates a certain
characteristic and, further, that this characteristic is
desirable or undesirable. These decisions may
not be assented to by those holding a different
philosophy. Obviously, when these observations
are made of a teacher who has a different
philosophy, the rating will tend to be low. In the
absence of objective effects upon pupils, such
ratings should be viewed with caution.

It may be interesting to illustrate differences
in evaluation of a teacher as a result of differing
criteria. Miss K, as reported by Briggs (14),
was one of 14 out of a class of 58 who were graduated
from the University of Wisconsin and who
were still teaching ten years later. In her first
position, which was for one year, she was rated
by her superintendent as "average". During her
second year, she taught part-time in each of two
schools. One of her principals was said to be
"very complimentary" about her work and
"would have liked to retain her permanently."
The other principal was also "enthusiastic." The
third position was for one year and no information
about her success there was available. At the
time of the investigation, she had been in her then
present position for seven years. The principal
rated her "good average." Among her assets as
listed were: good knowledge of her subject, carried
out assignments well, took an interest in pupils,
maintained fairly good order in her classes
without using repressive measures, good health,
had traveled extensively. Interviews with her pupils
indicated that she was interested in them,
permitted them considerable freedom in art classes,
and appeared to be well liked by them. The staff
considered her competent in her field. The principal
always knew where she stood.

Reading thus far, some will probably see in
Miss K a fairly good teacher. But Miss K is listed,
too, as having some liabilities. To the observer,
she seemed self-satisfied. She was "not
too ready to over-do, possibly lazy." She was
unable or unwilling to analyze herself, lacked enthusiasm,
did not have a friendly manner, was
not too generous with her talents, seldom volunteered
suggestions for plays, musicals, etc., for
school programs, went home week-ends, had no
part in the social life of the community, had not
prepared any special exhibit of the pupils' art
work, had taken only two or three professional
courses since beginning to teach.

The mere fact that Miss K had taught ten years
and was rated average or above by her principals
is not, of course, proof that she was a good teacher.
However, some who examine her record and
note that most of the liabilities were the impressions
listed by one who had had one interview with
the teacher and had observed one of her classes
for part of one class period would still be inclined
to rate her as successful.

While no specific evidence is given, there is
implication in the record that as an undergraduate
Miss K had rather low grades although there
was "apparently reasonable success in practice
teaching." She was willing to work but was satisfied
with inferior work. She was immature but
willing to accept responsibilities. This was not
an excellent record, but not by any means a failing
one.

The final evaluation of Miss K was: "Miss K's
case was a rather obvious one of improper occupational
choice. Had the case history been
compiled and presented at the time she applied
for admission to the School of Education, it is very
unlikely that any committee would have approved
her application. There seems to have been sufficient
evidence available at that time, had it been
assembled in case study form, to have indicated
that she would be a very doubtful risk as a teacher."
The neutral observer finds it difficult to
understand this devastatingly negative report.
Until the criteria of success are more objective,
there will be little likelihood of finding those
characteristics which measure teaching effectiveness.


Motivation 


Teacher’s Motivation 





There is agreement that an understanding of
motivation is necessary to an adequate explanation
of any effective performance, although only
in recent years has this subject been the object of
systematic research. With respect to teaching
effectiveness, there are two practical aspects.
One relates to the teacher: Why do persons
choose to enter the teaching profession? The other
relates to the pupils: How successful are
teachers in influencing their pupils to want to
learn?

Ringness (64) has presented in this monograph
a review of some of the theoretical considerations
relating to motivation and has summarized
his study of the motives underlying the choice of
teaching as a career. An assumption here is that
the reason for the choice is related to effectiveness
as a teacher. It is not surprising, as Ringness
pointed out, that those who assert that they
teach from a desire to be of service and who eulogize
the profession are acceptable to their principals,
nor is it surprising that relatively few
teachers would admit that they chose the work because
the professional courses were a snap and
the profession was easy to get into. Not included
in Ringness' list of motives, though they were recommended
in a study by Best (9) which the teachers
checked, were "liking to teach" as such or
"love of children or young people." These are
rather vague values, to be sure, but perhaps no
more so than the "welfare of society." In any event,
the assumption that one who teaches from a
desire to promote learning secures higher pupil
gains or better development of character than
does the one who is motivated to do a good job in
order to be promoted or in order to get funds to
do something other than teaching has not as yet
been conclusively proven. This is an area in
which further research is much to be desired.


Pupils’ Motivation 





Another assumption is that the teacher who is 
able to present her subject in such a way that 
learning can easily be accomplished is also one 
who can persuade her pupils not only to want to 
learn it, but more than that, to wish to continue 
learning. For, in the last analysis, the criterion
of effective learning is the creation in the individual
of a desire to learn more. A goal of education
limited to the learning of merely what is presented
in the classroom would surely be a low
one.

Little attention has been given in these studies 
to judging teaching effectiveness on the basis of 
engendering in pupils a desire to learn. Indeed, 
in several of the researches, attempts have been
made to control motivation by selecting populations
from similar socio-economic backgrounds,
thus possibly eliminating an essential factor of
the function to be measured. Whether or not a
different set of teacher characteristics is needed
for working with those pupils who are already eager
to learn than is required for those whose educational
aspirations are very low is another subject
for research. One hypothesis might be that
the teacher who has elected to teach because of 
his own interest in a particular subject-matter 
field and the opportunity teaching will give him to 
continue his study in that field will be more suc- 
cessful with those who have already learned to 
like to learn, while the teacher who has entered 
the profession from a desire to serve humanity 
will be more effective with those who have not yet 
developed any enthusiasm for the intellectual life. 
Perhaps a third set of motivations is desirable 
for the teacher of those whose interest is definite- 
ly directed toward the acquisition of only occupa- 
tional skills. It would appear, too, that occa- 
sionally a self-motivated individual appears for 
whom the most effective leader behavior is to get 
out of his way. 

The discussion of this paper has emphasized
that there are a multitude of assumptions and implications
inherent in the problem of understanding
the characteristics of effective teaching. The
role of the teacher is a highly complex one.
Many and conflicting pressures are at work which
influence teacher and pupil behavior and which
determine to various degrees success in teaching.
Many aspects of the problem have been intensively
and ingeniously attacked in the researches reported
in this volume. Many other aspects remain
to be explored. It seems increasingly
doubtful that any single characteristic of the
teacher herself, her training or her behavior,
will be indicative of her effectiveness. It is
probable that complex interrelationships among
the teacher, the pupils, the school administration
and the community are at work. Different combinations
of the same elements yield substances
having very different properties. Different combinations
of teacher characteristics almost certainly
will be found to be important for maximum
effectiveness in different situations. The next
step in research may well be to try to discover
what these interrelationships may be.


Summary 


All investigations into, and discussions of,
teaching effectiveness are based upon assumptions.
Some of these have been stated explicitly
by those doing the research here summarized.
Many more of them have been only implied, some
probably not brought to the level of consciousness.


Not all of the assumptions are present in every 
piece of research. Some are incompatible with 
others. Some particular ones appear under more 
than one general assumption. All, and undoubt- 
edly others not mentioned here, will eventually 
require consideration. Following are assump- 
tions, stated or implied, in some one or more of 
the studies reviewed in the present volume: 

1. There is a general teaching ability, a talent
for teaching. It is the same for each sex. It
is effective at all grade levels, with all types
of pupils (bright, dull, secure, insecure,
etc.); in all communities, in all areas of
learning (academic, social and moral). It is
of genetic origin. It can be recognized early.
It is constant. It can be recognized by observing
teacher behavior. It can be identified
by measuring certain teacher or pupil
performances toward particular goals of the
school. Training programs, though having
some value in its development, cannot create
it.

2. Teaching ability is a result of training.
Knowledge of the subject matter is sufficient
training. Professional training is essential,
this to include understanding of the individual
taught and his development (physical,
mental, social); the understanding of the
learning processes; appropriate methods of
teaching and classroom management; methods
of evaluating progress in learning; practice
teaching, and perhaps the purposes of
American education.

3. Words used in professional courses have the
same meaning for instructors and convey the
same meaning to students.

4. Teaching ability can be objectively measured
by the means of pupil gains in academic subjects,
personality, and character.

5. Teaching ability can be objectively evaluated
by trained observers of teacher activities
such as amount of talking by teachers, number
of questions asked of students, impartiality,
caring for individual differences, apparent
energy.

6. Subjective evaluation of teaching performance
has considerable validity.

7. Teaching effectiveness can be adequately rated
by supervisors, superintendents, principals,
subject-matter specialists.

8. Ratings given teachers are independent of the
philosophies of those preparing the rating
scales.

9. Certain personality characteristics or patterns
of characteristics are essential to effective
teaching. These are the same for all
ages, all types of pupils and all types of
learning. Certain characteristics are effective
in certain situations and other characteristics
in other situations.

10. Intelligence is directly related to teaching effectiveness.
This refers to general intelligence.
Certain of the primary mental abilities
are important, depending upon the situation.
Effectiveness may depend upon special
abilities.

11. Effectiveness is determined by motivation of
the teacher. Some types of motives result in
greater success than do others.

12. Effectiveness of teaching is determined by the
motivation of pupils.

13. The teacher can be of major influence in motivating
pupils.

14. Certain special conditions are related to effective
teaching, such as that permissive discipline
produces the better results; that rigid
discipline produces the better results; that the
speech of the teacher affects learning; that the
health of the teacher affects learning; that
socio-economic status affects learning; that
effectiveness is determined by conformity to
local standards.

15. Teaching effectiveness is a matter of an almost
infinite number of interrelationships among
teachers, pupils, administrative personnel,
colleagues, the community, influenced
by inherent talent and professional training
so that at best, perhaps, identification of
individuals with some very high ability or
several average abilities and no completely
negative one is all that can be hoped for.

16. The statistical techniques employed have been
adequate to isolate the factors determining
effectiveness; linearity exists between
effectiveness and such factors as intelligence,
knowledge, health, emotional stability.

17. The characteristics displayed in limited and
highly controlled situations are descriptive
of those which will function similarly in other
and more "natural" conditions. Techniques
employed in research are applicable in a
field situation.

18. Equal gains in various school subjects, as
measured by standard score techniques, represent
truly equal degrees of growth.

19. Traits found to be characteristics of good or
poor teachers as a group exist in individuals
who are good or poor.

20. The performance of a teacher when participating
in a research program is representative
of her customary or best performance.

21. The influence of such factors as home and
community aspirations, previous educational
influences and the educational equipment of
different schools can be adequately controlled.

22. The philosophies of education and psychology
underlying the research programs, the construction
of measuring instruments and the
evaluation of behavior are acceptable and common
to those involved.
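
The assumption listed above that equal standard-score gains represent equal growth can be made concrete with a small numerical sketch. The Python fragment below is an illustration only, not a procedure taken from the studies reviewed; the function name `standard_score_gain` and all scores and standard deviations are hypothetical. It shows that two identical standard-score gains may stand for quite different raw-score gains when the norm groups differ in spread, so that equality of standard-score gains does not by itself establish equality of growth.

```python
def standard_score_gain(pre, post, norm_sd):
    """Gain expressed in standard-deviation units of the norm group."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (post - pre) / norm_sd


# Hypothetical figures: two subjects whose tests have different spreads.
reading_gain = standard_score_gain(pre=40, post=50, norm_sd=20)     # 10 raw points
arithmetic_gain = standard_score_gain(pre=40, post=45, norm_sd=10)  # 5 raw points

# Both gains are one half of a standard deviation, although the raw
# gains (10 points and 5 points) are not the same.
print(reading_gain, arithmetic_gain)
```

Whether one half of a standard deviation in one subject is "truly equal" to one half of a standard deviation in another is a question the arithmetic cannot settle; it depends on how the two tests and their norm groups were constructed.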

Obviously, not all of the assumptions in the investigations
here summarized are included in this
list. Many of the assumptions mentioned have
been investigated in the researches reviewed in
this monograph. Although not infrequently results
have been inconclusive, much progress has
been made in the determination of the factors in
teaching effectiveness. There are still a large
number of assumptions remaining to be explored.


Practically all of the studies have been limited
ones. Data have been gathered according to
specific procedures, from definite sources, at
particular educational levels and have been interpreted
according to standards agreed upon for
purposes of the investigation. Many of the studies
have been in terms of pupil growth academically
or of teacher behavior supposed to be influential
in promoting academic growth or personality
development. Few have attempted to measure
both academic and personality growth, to say
nothing of morality or civic behavior. No study
has undertaken to evaluate the effectiveness of
the teacher on the educational effort as a whole:
growth academically, in personality, in character,
in the desire to continue learning, in the
teacher's personal traits and abilities as interrelated
with the traits and abilities of the children
in groups and as individuals, in terms of those of
differing levels of mentality or achievement, in
schools differently equipped, in communities of
differing socio-economic status and educational
aspirations operating under different educational
philosophies.


It is possible that ratings of principals and
superintendents come nearer than research investigations
or the observations of teachers according
to specified schedules to being evaluations
of the teacher's function as a whole. The
rating official may see the teacher in light of the
entire educational situation rather than in terms
of a particular child, a certain class, a specific
subject or a definite event. The overall rating
should, possibly, be the best one. But the situation
is not ideal. Rating officials have not been
trained in the process of rating. They have not
learned for what specific things to look. They
never have the opportunity to judge a teacher adequately
with respect to many of the essential
conditions. They rarely have knowledge of the
precise measures used to evaluate pupil growth
even in those areas where such measures are available.
Those measures which they do consult
are in only limited areas of the total school purpose.
They resort, then, to judgments concerning
classroom discipline, personal attractiveness,
apparent alertness, behavior in faculty meetings,
reports of pupils, parents, other teachers and the
like, all interpreted in the light of their own educational
philosophy or the philosophy of the investigator
who has prepared a rating chart for
them. And when rating charts are used, or when
training in rating is undertaken, the evaluations
return to the evaluation of specifics, not the educational
task as a whole.


The many studies here reported, incomplete 
as they are, have been necessary to provide certain
preliminary soundings. More of them are
needed. Then, as certain factors specific to par- 
ticular situations appear, the attempt will follow 
to establish their interrelationships. 





JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 30, Number 1, September 1961) 


CHAPTER XII 
TEACHER EFFECTIVENESS AND ITS CORRELATES 


A. S. BARR 


It is the purpose of this chapter to provide a
summary of the major findings of the series of
investigations here summarized. It may be observed
first that there are many approaches made
to teacher evaluation emphasizing various considerations:


I. Some general observations relative to the various
approaches made to teacher evaluation

1. Investigations of the measurement and
prediction of teacher effectiveness may
serve many purposes:

a. To validate admission practices;

b. To validate teacher education curricula;

c. To validate placement and employment
practices;

d. To determine on-the-job effectiveness.

2. Investigations of the measurement and
prediction of teacher effectiveness differ
greatly in scope:

a. Some aim to determine effectiveness;
some to predict effectiveness; and some
attempt both;

b. Some aim to determine teacher effectiveness
in some particular aspect of
teaching in a carefully controlled situation;
and some aim to determine general
teacher effectiveness with respect
to all teacher responsibilities and in a
variety of situations. The same aims
could apply to prediction;

c. Some aim to determine effectiveness
with reference to some particular criterion
and some aim to determine
teacher effectiveness with reference to
a multiple criterion.

3. Investigations relative to the measurement
and prediction of teacher effectiveness
employ different criteria:

a. Some employ supervisory and administrative
efficiency ratings;

b. Some employ time and frequency studies
of behavior to draw inferences that
become criteria of teacher effectiveness;

c. Some employ composites of scores on
tests thought to measure important prerequisites
of teacher effectiveness;

d. Some employ measures of pupil change.

4. Investigations relative to the measurement
and prediction of teacher effectiveness
have different psychological concerns:

a. Some are concerned primarily with
teacher competencies, performance,
and behavior;

b. Some are concerned primarily with personal
fitness, character and personality
traits and qualities of the person;

c. Some are concerned primarily with
those knowledges, skills, and attitudes
that constitute the prerequisites
to teacher effectiveness;

d. Some are concerned primarily with the
products of teacher activities and only
incidentally with teachers as such.


5. Investigations of the measurement and
prediction of teacher effectiveness define
teaching differently:

a. Some have in mind, chiefly, those activities
that one might associate with
the teacher as a director of learning;

b. Some would define teaching to include
responsibilities for pupil guidance;

c. Some would include the extra-curricular
responsibilities of the teacher;

d. Some would include the school-community
responsibilities of the teacher;

e. Some would include the extra-school
professional responsibilities of the
teacher.


6. Investigations of the measurement and
prediction of teacher effectiveness employ
different sorts of data-gathering devices
constructed for different purposes and representing
different theories of measurement:

a. Some are checklists aiming to take cognizance
only of the presence or absence
of certain behaviors and characteristics
of situations;

b. Some aim merely to record the times
spent upon various sorts of activities
or the frequency with which they occur;

c. Some aim to attach values to observed
behaviors as in rating scales;

d. Some are reporting devices (questionnaires
and inventories) wherein the subject
or someone answers questions;

e. Some are ranking devices, paired comparisons,
and the like, that depend upon
stated or inferred criteria, upon the
availability of the facts, and upon the
trustworthiness of the reporter;

f. Some are testing devices conforming
more or less to the conventions of test
construction;

g. Some are verbal records, diaries, observers'
notes, and various sorts of
documents.

7. Investigations of the measurement and
prediction of teacher effectiveness have
been predominantly correlational studies
and have all the limitations of such studies:

a. The coefficient of correlation measures
going-togetherness and not cause and
effect relationships. Many of the assumed
relationships are possibly spurious;

b. Coefficients of correlation cannot be
taken at face value. The size of a coefficient
of correlation depends upon
many things, only one of which may be
the relationship between the variables
under investigation and the criterion;

c. Coefficients of correlation make assumptions
about the nature of human
abilities, the most obvious ones being:
that they can be measured, that abilities
are additive, and that there is
sufficient stability in measures and
functions to permit reliable predictions,
all of which may or may not be
correct;

d. The coefficient of correlation makes
certain specific mathematical assumptions,
some of which are met by
these investigations and some of which are not.

8. Investigations of the measurement and
prediction of teacher effectiveness employ
different psychological orientations:

a. Such investigations involve theories of
mind-body relationships and the biophysical
foundations of human behavior;

b. Such investigations involve concepts
such as those of aptitude, ability, performance,
competency, skill, and efficiency.
These concepts need better
definition;

c. Such investigations involve theories of
human behavior: mental organization,
motivation, problem solving, perception,
learning, transfer, and the like,
that have long been the concern of psychologists;

d. Such investigations involve theories of
the structure of human abilities, the
commonest being that abilities are divisible
and additive;

e. Such investigations involve theories
relative to the specificity and generality
of human abilities.


Some investigators seem not to realize that 
thoroughly competent persons may view teacher 
effectiveness differently and that there are many 
different ways of combining these different em- 
phases in teacher evaluation research. 
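
The caution above that the size of a coefficient of correlation depends upon many things can be illustrated with a short sketch. The example is not drawn from the investigations summarized; the data are synthetic and deterministic, and the names `pearson_r`, `predictor`, and `criterion` are hypothetical. It shows one well-known influence, restriction of range: when only the upper part of a predictor's range is represented (as may happen when teachers have already been selected on that predictor), the coefficient shrinks although the underlying relationship is unchanged.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment coefficient of correlation for paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Synthetic data (an assumption for illustration): the criterion equals the
# predictor plus a fixed, repeating disturbance between -20 and +20.
predictor = list(range(100))
criterion = [x + 4 * ((4 * x) % 11 - 5) for x in predictor]

r_full = pearson_r(predictor, criterion)

# Keep only cases at or above 80 on the predictor, as if selected beforehand.
selected = [(x, y) for x, y in zip(predictor, criterion) if x >= 80]
r_restricted = pearson_r([x for x, _ in selected], [y for _, y in selected])

print(round(r_full, 2), round(r_restricted, 2))
```

In this sketch the full-range coefficient is high and the restricted-range coefficient is markedly lower, even though the rule generating the criterion is the same in both cases. A low coefficient found in a preselected group therefore need not mean that the characteristic is unimportant.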


II. Categories of Effectiveness 





One of the facts brought home time and time
again by these investigations is that researchers
in this area employ a very extensive vocabulary
in describing teacher effectiveness. Not only is
the list of descriptive terms used to describe
teacher effectiveness and its prerequisites very
long, but there is a tendency to use words in a
non-technical layman's sense rather than in a
manner characteristic of scientific study. Even
where the investigators define terms they are
likely to define them in a manner personally preferable
to the investigator rather than in more universally
useful ways. This problem is one that
has long plagued the behavioral sciences. There
are very few operational definitions.

Even were the vocabulary more adequately
defined there is an overwhelmingly large number
of terms employed in discussing the assessment
and prediction of teacher effectiveness. There
are too many terms employed to make meaningful
communication possible for the many individuals
who need to communicate about teacher effectiveness
and make predictions as in the selection, education,
and placement of teachers in pre-service
programs of teacher education and in the supervision,
administration, and improvement of the
staff in in-service. To secure more manageable
lists of descriptive terms, both individual and
group attempts have been made to combine, condense,
and telescope these many terms into shorter
lists of one sort or another; on the objective
side, factor analysis has been employed.

From the consensus approach, progress
would appear to have been made in the development
of a somewhat abbreviated list of terms and
categories that may prove useful in discussing
the personal prerequisites to teacher effectiveness.
The twenty-five personality traits suggested
by Charters* and others have been further reduced
to fifteen qualities by the researches here
summarized with appropriate synonyms. This
list needs further reduction.


*W. W. Charters and Douglas Waples, The Commonwealth Teacher-Training Study (Chicago: University
of Chicago Press, 1929).







1. Buoyancy: Surgency, optimism, enthusiasm,
cheerfulness, gregariousness, unsuspiciousness
and uninhibitedness, talkativeness,
sense of humor, pleasantness, carefreeness,
vivaciousness, alertness, wittiness.

2. Considerateness: Concern for the feelings
and well-being of others, tolerance, sympathy,
understanding, unselfishness, patience,
helpfulness, friendliness, easy-goingness,
geniality, generousness, warm-heartedness,
thoughtfulness, kindliness.

3. Cooperativeness: Proneness toward joint action,
willingness to share responsibility,
readiness to work with others, respect for
others, helpfulness when things need to be
done, agreeable to working with others, a
good team worker.

4. Dependability: Reliability, loyalty, honesty,
punctuality, responsibility, conscientiousness,
accuracy, painstakingness, trustworthiness,
sincerity.

5. Emotional Stability: Realism in facing life's
problems, freedom from emotional tensions,
not easily upset, poised, self-controlled, relaxed,
steady, unhurried, consistent.

6. Ethicalness: Good taste, modesty, morality,
conventionality, cultural polish, refinement.

7. Expressiveness: Skill in communication, responsiveness,
verbal fluency, articulateness,
agreeableness of voice, good inflection,
audibility.

8. Flexibility: Capacity for approaching things
in a novel manner, imaginativeness, adaptability,
inventiveness, initiativeness, originality,
creativeness, enterprisingness, resourcefulness.

9. Forcefulness: Ascendance, dominance, confidence,
independence, self-sufficiency,
self-reliance, persistence, purposefulness,
intending to accomplish, persuasiveness,
commanding respect, aggressiveness.

10. Judgment: Wisdom in the selection of appropriate
courses of action, discretion in dealing
with others, foresight, common sense,
clearheadedness.

11. Mental Alertness: Brightness, intelligence,
academic aptitude, capacity for thinking,
power to comprehend.

12. Objectivity: Fairness, impartiality, open-mindedness,
freedom from prejudice, use
of factual evidence in making criticisms and
decisions.

13. Personal Magnetism: Attractively dressed,
good physique, absence of distracting physical
defects, absence of distracting mannerisms,
cleanliness, posture, personal charm,
appearance.

14. Physical Energy and Drive: Readiness for action,
drive, physical vigor and energy, determination,
desire to get things done, vitality,
endurance.

15. Scholarliness: Scholastic aptitude, thorough
knowledge of subject, being well informed on
many subjects, high verbal aptitude, widely
read, literateness.


In another approach, a list of the major responsibilities
of the teacher has been suggested in
the Wisconsin adaptation of the M Blank of the
Evaluative Criteria, under the four broad categories
as follows:


A. The teacher as a director of learning

B. The teacher as a friend and counselor of
pupils

C. The teacher as a member of a school community

D. The teacher as a member of associations of
professional workers


Barr (5) in a summary of research relative to the 
measurement and prediction of teaching efficiency, 
besides categorizing the personal characteristics 
of teachers, found investigators concerned with 
categories as follows: 


A. Various sorts of skills

1. Skill in identifying pupil needs

2. Skill in setting and defining goals

3. Skill in creating favorable mind sets,
readiness, motivation

4. Skill in choosing learning experiences

5. Skill in directing the learning process

6. Skill in using learning aids

7. Skill in teacher-pupil relations

8. Skill in appraising pupil growth and
achievement

9. Skill in appraising management

10. Skill in developing suitable work habits

11. Skill in the use of language and other media
of communication


B. Knowledges 


1. Knowledge of child behavior, development
and learning

2. Knowledge of subject taught or activity directed

3. Knowledge of professional practices and
techniques

4. General cultural background

5. Scholarship


C. Interests, attitudes, ideals 

1. Interest in pupils 

2. Interest in subject taught or activity di- 
rected 

3. Interest in teaching and related profes- 
sional activities 

4. Interest in school and extra-curricular 
activities 

5. Interest in community 

6. Interest in self-improvement 

7. Social attitudes and values 


In the products approach to the measurement
and prediction of teacher effectiveness, there is
no substantial categorizing of the products of education.
The main products would appear to be
effects on pupils, effects on co-workers, effects
on parents and other members of the community,
and effects upon professional workers wherever
they may be and by whatever means. The products
may also be material things: course outlines,
teaching aids, textbooks, professional
books and articles, and whatever the tangible products
of the teacher's efforts may be. There is
need for careful definition of products.

Turning in another direction, some progress 
may have been made in developing categories 
through the use of factor analysis techniques. 

Hellfritzsch (34), Hampton (33), Schmid (70), 
Lamke (46) and others made factor analyses. 
Hellfritzsch made two factor analyses. From
one of his analyses he found four factors which he
chose to designate both by letters and verbally as
follows:


GKMA General Knowledge and Mental Ability 

TRS Teacher Rating Scale Factor 

PESA Personal, Emotional, and Social Adjustment

EATP Eulogizing Attitude Toward the Teach- 
ing Profession 


In a second analysis he found the following fac- 
tors: 


GKMA General Knowledge and Mental Ability 

TRS Teacher Rating Scale Factor 

PEA Personal Emotional Adjustment 

EATP Eulogizing Attitude Toward the Teach-
ing Profession

PGTA Teaching Ability Factor 


The chief components of the last named fac-
tor were liberal social beliefs and a sound work-
ing knowledge of the symptoms, causes, and rem-
edies for various kinds of pupil adjustment prob-
lems.

Schmid (70) analyzed the data for men and
women separately. From the analysis of female
scores he found the following factors:


PRS Problems Response Set 
PM Professional Maturity 
INT Introversion 

SA Social Adjustment 


Two common factors were found from the analy- 
ses of male scores: 


SEA Social and Educational Adjustment 
PPF Personality-Psychological Factor


The Washburne test scores possessed the 
greatest discriminative value for measuring so- 
cial and educational adjustment and the Mooney 
Problem Check List the best measure for the
personality-psychological factor. The full meaning
of these conclusions can be had only from a
careful study of the data themselves. In comparing
these results with those of Hellfritzsch, one ob- 
serves that research design, the measures em- 
ployed, and vocabulary preferences in naming 
factors may give what appears to be on the surf- 
ace different factor structure. The two studies 
agree, however, in emphasizing personal-social 
adjustment and related matters. 

Hampton (33), basing her analyses upon a 
twelve-item rating scale and several different 
groups of teachers found common factors as fol- 
lows: 


Professional mindedness, i.e., teachers’ attitude
toward the school and administration

Technical competencies, i.e., knowledge of sub-
ject matter, discipline, emotional poise,
general culture, resourcefulness, speech

Physical attributes, i.e., health, vitality, per- 
sonal appearance 


Subsequent analyses appeared to support the first 
analysis in some respects but with very definite 
shifting of groupings in several instances. 

Lamke (46), with yet a different research de-
sign and using the Cattell 16 PF test, concludes
that good teachers are likely to be talkative,
cheerful, placid, frank, and quick, while poor
teachers are likely to be silent, depressed, anx-
ious, uncommunicative, and languid.

While these studies make some contribution
to the structuring of teaching ability, it seems
reasonably apparent that the structure secured is
quite clearly tied to the research design, the data-
gathering devices employed, the teachers studied,
the value system, and the vocabulary of the par-
ticular investigator. The progress in this direc-
tion, when viewed in retrospect, seems quite
meager.

In brief, the purpose of this particular sum-
mary, that is, the summary in this section, is to
suggest that there is a problem of terminology in
the area of evaluating and predicting teacher ef-
fectiveness. For practical reasons the list must
be shortened, but it must be done according to
established fact; theory is important but theory
is not enough. The researches here summarized
appear to make some contribution to the categor-
ization of human ability.


III. Some Further Observations that Seem to Grow
Out of the Investigations Here Summarized


1. While there are many possible designs
for research in this area, the researches
here summarized are for the most part
descriptive investigations carried on
with available populations or total popu-
lations in particular situations. For the
most part, these investigators have not
attempted to generalize beyond the pop-
ulations studied. Whether one needs to
limit the findings to the populations stud-
ied or whether one might correctly gen-
eralize beyond these populations to some
larger population of which these popula-
tions may be representative is a moot
problem of research design.


2. Within a descriptive, exploratory frame
of reference, the researches here sum- 
marized employed a great variety of re- 
search techniques: 

a. There are a very large number of 
status studies, each design chosen to 
fit the purposes of a particular re- 
search, in which observation, tests, 
interviews, questionnaires, and other
data-gathering devices were em- 
ployed to ascertain the status of some- 
thing, as for example, the opinions 
of teachers, superintendents, and ed- 
ucational specialists about the com- 
petencies needed by teachers, the at- 
titudes and beliefs of teachers on all 
sorts of matters, the expectancies of 
school board members about the per-
sonal lives of teachers, what teachers
do in the classrooms, and various 
situational factors. 

b. There were studies of the likenesses
and differences among and between 
good and poor teachers relative to age, 
training, sex, classroom behavior, 
temperament, personal characteris- 
tics, beliefs, and many other things. 

c. There were a limited few experiment-
al studies pertaining to both the pre-
service and in-service characteris-
tics of teachers.

d. There were many short-term and
long-term case studies pertaining to
many aspects of teacher effectiveness.

e. There were many simple correlational
analyses.

f. There were a number of factorial
analyses based upon a variety of data-
gathering devices and pursued with 
reference to different aspects of 
teacher effectiveness. 

g. There were a very large number of
prediction studies employing regres- 
sion techniques. 

h. There were many follow-up studies
of the students graduated from the 
University of Wisconsin performed 
primarily for evaluative purposes. 

i. There were a number of validation
studies aiming to test the validity of 
particular data-gathering devices, 
such, for example, as the Minnesota 
Multiphasic Personality Inventory, 
various tests of teacher-pupil rela- 
tions, personality rating scales, and 
a teacher judgment scale. The in-
strumentation was usually complex
and the statistics relatively simple, 
the degree of statistical sophistication 
varying from investigation to inves- 
tigation. 


3. The two problems, namely, the problem
of estimating and predicting a teacher’s
effectiveness in a particular situation,
and the problem of estimating and pre-
dicting the teacher’s effectiveness in
general, i.e., in a variety of situations
with a variety of pupils aiming to achieve 
a variety of purposes within a defined 
area of specialization, should be clear- 
ly differentiated. To make this differ- 
entiation one might observe first that 
every situation is composed, from one 
point of view, of two sets of factors: 1) 
those unique to the particular situation 
and the conditions surrounding it, and 
2) those factors, elements or aspects 
common to all or to groups of situations 
within a defined area of specialization. 
One’s concern may not extend beyond a 
particular teacher in a particular situa- 
tion, but even so, she may have charac- 
teristics that are common with other 
teachers in other situations. If one’s 
concern is with the problem of general 
teaching ability, the research design will 
be different. One ordinarily begins by 
defining the populations to be investigat- 
ed. These populations are not merely 
populations of teachers, but of pupils, 
purposes, and conditions. Sampling 
techniques are frequently essential to 
this type of concern. 





Insofar as the investigations here sum- 
marized are concerned with the general 
teaching ability of particular teachers, i.e., 
the ability to achieve the purposes of educa- 
tion with many sorts of pupils in a variety of 
situations for a given type of assignment or 
area of specialization, the design of the in-
vestigations is not such as to provide this in-
formation. The designs are too localized to
supply this information. To supply this in- 
formation there would need to be much more 
variety in pupils, situations, and purposes 
for each teacher to establish general teach- 
ing ability even within a limited area of spe- 
cialization. 


4. A variety of criteria of teacher effective-
ness have been employed in these inves-
tigations, namely, a) ratings of teacher
efficiency made by many different types
of raters: other teachers, pupils, teach-
er training personnel, supervisors, and
superintendents, with the aid of many sorts
of rating scales; b) pupil gain scores on
tests thought to measure pupil learning
(most of these scores were adjusted for
the effects of measurable pupil factors in-
fluencing learning); c) composites of scores
on tests of qualities thought to be associ-
ated with teacher effectiveness; d) sys-
tematic studies of the classroom behav-
ior of teachers and pupils; e) grades in
practice teaching; and f) grade point aver-
ages in all subjects, professional courses,
majors and minors, taken as a part of the
teacher education program. In this con-
nection, it can be further added that, from the
data in the different studies, it would ap-
pear that these different criteria measure
different aspects of teacher effectiveness,
i.e., judgments about whether a teacher
is effective or not may depend upon the
criterion used.


5. Criterion building is a complex and diffi-
cult undertaking. Some of the problems
associated with criterion building were 
discussed in some detail in an earlier 
chapter. While the different criteria give 
different results, it would seem reason- 
able to assume that the different criteria 
with some extensions might give approx- 
imately the same results. Some of the 
matters that need special attention, if 
comparable results are to be secured from 
different criteria, are: a) There must be
an agreed upon definition of teaching
(teaching may mean many different things).
b) If an elementalist (constituent) approach
is made to the study of teacher effective-
ness, by whatever method, the coverage
should be reasonably complete. Differ-
ences in coverage would be expected to
lead to differences in results. If, for 
example, the personal characteristics 
approach is used in criterion building, 
it might be extended to include both per-
sonal and professional characteristics;
if pupil gain is used as the criterion, 
then it might be extended to cover per- 
sonality changes in pupils as well as sub- 
ject matter outcomes, and if the basic 
prerequisites approach is employed, then 
the criteria may need to be extended to 
include a wider range of knowledges, at- 
titudes, and skills; c) the vocabulary 
used in all approaches will need to be 
more carefully defined and equated from 
one approach to another. Operational 
definitions might be used where appro- 
priate; d) more attention must be given 
to data-gathering processes. It has long 
been said that tests must be valid and re-
liable, but these concepts must be ap-
plied to all data-gathering devices (sam-
pling and drawing inferences from be-
haviors present very special difficulties),
and e) special attention must be given to 
the manner of summarizing data. In 
some instances they would appear not to 
be additive. 


6. While all of the criteria present special
problems, Worcester, in an earlier dis-
cussion in this summary, has called at-
tention to the inadequate training of the
educational personnel for their evalu-
ative responsibilities. He was speaking
particularly of efficiency ratings, but
this inadequacy extends to almost all
aspects of educational evaluation. It is
particularly applicable to the pupil gain
criterion. Two problems that need par-
ticular attention in the use of this cri-
terion are: 1) the measurement problem,
and 2) the separation of teacher effects
from other matters that produce changes
in pupil growth and achievement.

With reference to the first, while now
more than a half century has been devot-
ed to test construction, there are many
important educational outcomes without
valid and reliable measuring instru-
ments. While those investigators using
the pupil change criterion in the studies
here summarized were well
aware of this matter and while attempts
were made to measure certain of the non-
informational outcomes, the inadequacy
is present nonetheless and must have
continued attention. The greatest lacks
were in the realm of personality devel-
opment and pupil growth. Beyond this,
the low correlations found between the pu-
pil gain criterion and the other criteria
can be traced, among many other things,
to the lack of curricular validity of the
tests employed. Teachers are an inde-
pendent lot. While some doubtlessly taught
for the test objectives and content, there
are probably just as many that went their
own free way, and this will influence the
relationships studied. As to the inter-
mixture of effects, the adjusted pupil gain
score used in the investigations repre-
sented an attempt to take into considera-
tion fairly commonly recognized non-
teacher influences. There were many
more. Incidentally, all of these investi-
gators concerned themselves with the im-
mediate learning effects; the more remote
may be more important.
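
The idea of an adjusted pupil gain score can be illustrated with a minimal sketch. It assumes, purely for illustration, that the only pupil factor controlled is a pre-test score and that the adjustment is a simple least-squares regression of post-test on pre-test over all pupils; the investigations summarized here do not specify their procedure in this section, and the function name, record layout, and sample scores below are hypothetical.

```python
from statistics import mean


def adjusted_gain_by_teacher(records):
    """Return each teacher's mean residual gain.

    records: list of (teacher, pretest, posttest) tuples, one per pupil.
    Assumption for illustration: post-test is regressed on pre-test alone
    (posttest = a + b * pretest) across all pupils, and a teacher's score
    is the mean of her pupils' residuals from that line.
    """
    pre = [r[1] for r in records]
    post = [r[2] for r in records]
    mean_pre, mean_post = mean(pre), mean(post)

    # Ordinary least-squares slope and intercept for one predictor.
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = mean_post - slope * mean_pre

    # Collect residuals (actual minus predicted post-test) by teacher.
    residuals = {}
    for teacher, x, y in records:
        residuals.setdefault(teacher, []).append(y - (intercept + slope * x))
    return {teacher: mean(values) for teacher, values in residuals.items()}


# Hypothetical scores for two teachers with two pupils each.
sample = [("A", 10, 22), ("A", 20, 32), ("B", 10, 18), ("B", 20, 28)]
print(adjusted_gain_by_teacher(sample))
```

A positive value means the teacher's pupils finished above what their pre-test scores alone would predict. Any non-teacher influence left out of the regression remains mixed into the residual, which is the difficulty noted above.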


7. There is much unevenness in the abilities
of teachers, i.e., they may be low in some
abilities and high in others. There appear
to be very few multi-talented teachers.
From the data here summarized, it might
be hypothesized that a good teacher is one 
that has one or more special talents and 
no deficiencies that the school, commun- 
ity, or administration consider critical, 
with many intermediate abilities spread 
more or less normally between these ex- 
tremes. 

In this connection, the question arises 
as to whether abilities are additive or not, 
that is, whether shortages in one ability 
may be made up by an abundance in an- 
other. This would appear not to be true; 
if true, then possibly only for closely cor- 
related measures. It is possible that 
there may be a number of uncorrelated 
factors in teacher effectiveness each with 
an unspecified minimum cut-off point be- 
low which one may not fall and be an ef- 
fective teacher. Falling below this cut- 
off point in any one critical factor, even 
though the individual may be high in all 
other factors, may make the teacher un-
acceptable to the administration, other
teachers, pupils, and parents, or other-
wise ineffective. It is not clear to what
extent the investigators associated with
the researches here summarized were
aware of this possibility or considered it
a hypothesis worthy of investigation.
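
The contrast between additive abilities and minimum cut-off points can be made concrete with a short sketch. The factor names, weights, scores, and cut-off values below are hypothetical and are chosen only to show how the two rules can disagree; they are not drawn from the researches summarized.

```python
def additive_score(scores, weights):
    """Compensatory rule: a weighted sum, so a surplus in one factor
    can make up for a shortage in another."""
    return sum(weights[factor] * scores[factor] for factor in weights)


def meets_cutoffs(scores, cutoffs):
    """Multiple cut-off rule: every factor must reach its own minimum;
    no surplus elsewhere can make up for a single shortfall."""
    return all(scores[factor] >= minimum for factor, minimum in cutoffs.items())


# Hypothetical factors, each on a 0-10 scale.
weights = {"knowledge": 1, "pupil_relations": 1, "management": 1}
cutoffs = {"knowledge": 4, "pupil_relations": 4, "management": 4}

uneven = {"knowledge": 9, "pupil_relations": 9, "management": 2}
steady = {"knowledge": 6, "pupil_relations": 6, "management": 6}

for name, teacher in (("uneven", uneven), ("steady", steady)):
    print(name, additive_score(teacher, weights), meets_cutoffs(teacher, cutoffs))
```

Under the additive rule the uneven teacher ranks above the steady one, yet she falls below the cut-off in one factor and would be judged unacceptable under the second rule. A prediction equation that simply sums weighted scores cannot represent the second case.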


8. Much time was spent in the early studies
in the development of the foundations for
the objective study of teacher and pupil
behavior. In these activity analyses much
time was spent in translating the then very
subjective language employed in describ-
ing teachers and teaching into observable
pupil and teacher behaviors, with studies
of their objectivity and validity. Much
time and effort was also spent upon means
of recording behaviors, including time
charts, activity check lists, and sound re-
corders for recording the verbal behav-
iors of teachers and pupils. The first
sound recorder making such records was
developed by Barr, Department of Educa-
tion, and Koehler, Electrical Engineering
Department, University of Wisconsin.


9. Many of the early studies were concerned
with teacher-pupil behavior and teacher-
pupil relationships. Barr, for example,
studying classroom behavior of teachers,
found that the good teacher attended care-
fully to pupil responses; she frequently
smiled appreciatively; she was patient;
she laughed with the class from time to
time; she conducted discussions in a con-
versational manner; she was enthusiastic;
she made frequent use of pupil exper-
iences; she possessed a variety of ways of
commenting upon pupil responses; she had
a good sense of humor; and she socialized
class discussions.

One of the hypotheses developed by
these early studies was that teacher acts 
have an appropriateness aspect; they are 
not good or bad in general but in relation 
to purposes, persons, and situations. Un- 
der such an hypothesis one would expect 
teacher behaviors, collected out of con- 
text, not to distinguish good teachers from 
poor teachers. The parts may be present 
but scrambled, as it were; the effective-
ness of one teacher over another comes
in the ordering of the parts, i.e., in their
appropriateness. Most teachers through 
courses in pedagogy and long experiences 
as students come to know a very large 
number of the constituents of teaching, 
but it takes a high level of creativeness 
and understanding to create new constitu- 
ents and to get them properly ordered. 
Aside from this, investigators, too, have 
a problem of how best to order teacher 
behaviors. One way for investigators to 
order teacher behaviors is to approach 
them with definite hypotheses. They might,
for example, be ordered from the point of
view of appropriateness and as they con- 
cern purposes, persons, and situations; 
from the point of view of important learn-
ing principles such as individual differ-
ences, readiness, motivation, and knowl-
edge of progress in learning; and from the
point of view of human relationships such
as respect for and considerateness in the
treatment of pupils, permissiveness in 
teacher-pupil relations, and pupil orient- 
ed instruction. Data relative to teacher 
behavior which may not be very important 
in and of themselves in distinguishing 
good teachers from poor teachers may be- 
come exceedingly important as materials 
for building inferences about teacher ef- 
fectiveness. 


10. In keeping with the frequently expressed
interest of employing officials in the per-
sonality of teachers, these investigators
almost without exception explored the use
of personality measures as predictors of
teacher efficiency. The instruments were
those most frequently found in the litera-
ture of the time: the Thurstone Tempera-
ment Scale, the Minnesota Multiphasic
Personality Inventory, the Cattell 16 PF
Scale, the Guilford-Zimmerman Temper-
ament Survey, the Washburne Social Ad-
justment Inventory, the Morris Trait In-
dex L, the Bernreuter Personality Inven-
tory, the Bell Adjustment Inventory, the
Torgerson Theory and Practice of Mental
Hygiene, and many scales and inventories
constructed for the particular investiga-
tions. No attempt was made except by
Bollinger (11) and Singer (73) to relate
the personality of the teachers to personal
changes in the pupils. The absence of
such data arises probably out of the dif- 
ficulties that one encounters in isolating 
the teacher effects from other influences 
such as those of parents, other teachers, 
and other pupils that impinge upon pupils. 
Then, too, these changes come about
slowly, spreading over considerable inter-
vals of time, more than the span of most
investigations. The reference here is not 
to at-the-moment pupil behavior, fre- 
quently investigated in teacher-pupil in- 
teraction studies, but to the residual per- 
sonality changes which are more difficult 
to isolate and to measure. 


11. In keeping with the principle of individual
differences, there is considerable evi- 
dence in the researches here summarized 
to indicate that teachers differ in both the 
respects in which they succeed and the re- 
spects in which they fail. This would 
seem to hold true not merely when the dif-
ferentiation is made in terms of behavior
but, also, in terms of the basic prerequi-
sites (knowledges, attitudes, and skills) to
effectiveness and in terms of personality.
Some of these individual differences doubt-
lessly arise out of the unique character-
istics each teacher possesses and not en-
compassed by the characteristics shared
by other teachers. A statement of this
sort is presumably true for all generali-
zations. Generalizations cover what is
common to a number of persons, situa-
tions, and events, and leave not covered
the uncommon. But beyond this there may
be reason to believe that where the cri-
terion is an efficiency rating, each rater
has his own preferences, notwithstanding
the fact that they employed a common
rating scale, and he tends to rate high
those teachers who have the characteris-
tics that we associate with excellence, and
to rate low those teachers with an absence
of those characteristics. While a com- 
mon cultural and a common educational 
background would tend to produce like- 
nesses among raters, there may still be 
enough individuality left in raters to pro- 
duce differences in the ways in which 
teachers fail and succeed, i.e., from the 
point of view of supervisors, administra- 
tors and educational specialists. 


12. Teaching does not take place in a vacuum;
it takes place in a very definite tangible
situation. This aspect of teacher effec-
tiveness is so pervasive that it needs more
attention than it has yet received. Effec-
tiveness does not reside in the teacher
per se but in the interrelationship among
a number of vital aspects of a learning-
teaching situation and a teacher. It is
common practice to characterize the ef-
fective teacher in terms of qualities of the
person; time has seen the emphasis shift 
from the teacher per se to the teacher in 
relation to the more important aspects of 
a situation: needs, purposes, pupils, 
available means, and the socio-physical 
environment for learning and teaching. 
This needs careful consideration. 


13. From the data presented in these investi-
gations, there would appear to be certain
broad categories of effectiveness that 
should be recognized and more generally 
used in the evaluation and prediction of 
teacher effectiveness, such for example, 
as a cognitive category composed of such 
matters as intelligence, academic apti- 
tude, and knowledges such as those asso- 
ciated with subject matter specialization, 
cultural background, and professional 
competency; an affective category com-
posed of such matters as interest, atti-
tude, motivation and value systems as
they relate to the major components of
teaching; a physical fitness category com-
posed of good health, energy and drive
and the many bio-physical factors in ef-
fectiveness; a personal fitness category
composed of qualities such as personal
magnetism, cooperativeness, buoyancy,
considerateness, emotional stability, eth-
icality, expressiveness, forcefulness, in-
telligence, judgment, objectivity, physi-
cal energy, reliability, resourcefulness,
and scholastic proficiency; a professional
competency category where knowledges, 
skills, attitudes, and personal character- 
istics are put together in action which can 
be further characterized; and a general- 
ized skills category composed of skills 
such as the arts of communication, skill 
in human relations, and the many manip- 
ulative and verbal skills associated with 
teaching. One is impressed by the lack
of agreement among investigators in their
approaches to teacher evaluation and in
the vocabulary used to describe it. Every
investigator would appear to be a law un- 
to himself with his own categories. Pos- 
sibly the time has now arrived for more 
agreements in approaches. 





14. As one studies these many investigations
the need for some generally acceptable
comprehensive theory of the nature and
structure of human abilities, as exhibited
in teaching ability, is strongly felt. It is
not that these investigators were not well
versed in psychological theory but that
each, like most investigators in this area,
has his own theory, carefully stated or
assumed. For progress in research in
any area there is need for individuality,
but for action there is also need for con-
sensus such as might be had by teams of
scholars and the school personnel looking
for agreements. This monograph presents
no such theory of human abilities, but its
authors believe that some of the building
blocks for such a structure have been
moved to a building site.


15. Some persons will consider the studies
here summarized too intricate and detailed 
to be very helpful to the practitioner. In 
that connection, two observations are 
offered: a) These are exploratory studies. 
It is common observation within the his- 
tory of science and invention that the first 
attempts at implementation and instru- 
mentation are frequently quite crude and 
primitive. These investigators have been 
looking for leads. Many of their leads
will grow up. And b) if one expects school
administrators to set an example in the
humane management of their personnel, their
interests must extend beyond hiring and
firing teachers to helping them, and to the
more general development and conserva-
tion of human resources. From this point
of view, the aim of these investigations 
has been to suggest different sets of cri- 
teria with appropriate instrumentation that 
might serve as preliminary screening de- 
vices. These preliminary screening de- 
vices should provide information relative 
to the overall effectiveness of teachers, 
with some idea as to their major strengths
and weaknesses. The more detailed ex-
amination of weaknesses, if there are
such, will need to be explored through the
aid of other professional groups and other
specialists who provide services relative
to the bio-physical and psychological foun-
dations of human behavior, i.e., even with good
instrumentation the job is not complete.


16. As one reads the literature, examines
these investigations, and thinks carefully
about the problem, one gets the impres-
sion, with a few excellent examples to the
contrary, that many of these investigators
have been too closely hemmed in by cer-
tain sets of realities to the neglect of
others. In a certain sense it is not so
much what the situation is as what one 
thinks it is, and these discrepancies be- 
tween reality and what one thinks must al- 
so be investigated. There are many per- 
ceptions of, and expectancies relative to, 
the various aspects of each learning- 
teaching situation that must have closer 
study, and attitudes that must be explored. 
One will find many instances in these 
studies where they have been explored, as 
for example, the studies of the perception 
and expectations that high-school seniors 
hold with reference to teachers and teach-
ing, that school board members have of
teachers, or that teachers hold of them- 
selves, and the annoyances and satisfac- 
tions that attend teaching. But they need 
more systematic consideration. 


17. While only limited use was made of factor
analysis, the factors secured appeared to
be products of the research design, the 
data-gathering devices employed, the na- 
ture of the teaching assignment, and the 
vocabulary preference of the investigator. 
To increase the usefulness of factor anal- 
ysis methods, more attention must be 
given to psychological orientation, the 
data-gathering devices, and the original 
correlations to which factor analysis tech-
niques are applied. The papers by Abel
and Schmid seem to show that there are
also very much improved statistical pro-
cedures that one may employ with im-
provement in the information secured.


18. It is possible that the regression technique
so generally employed in these investiga- 
tions does a reasonably good job in pre- 
dicting potential, but many teachers do 
not live up to the expectations relative to 
them. There are many things that bring 
about these discrepancies: lack of inter- 
est in the subject taught, in the pupils, or 
in teaching itself, attitudes and values not 
acceptable to the administration, to the 
majority of the teachers, parents 
and pupils; personal and professional in- 
compatibilities of various sorts; and in- 
flexibilities that may limit each teacher’s 
effectiveness. 


19. There is some evidence in these investi-
gations that the so-called "efficiency
rating" may be to a certain extent a com-
patibility rating. The extent to which this
might be true would appear to depend 
greatly upon the rater and somewhat upon 
the rating instruments employed. 


20. In looking back over the investigations
herein summarized, one might ask what 
progress has been made? In brief, pro- 
gress would appear to have been made in 
clarifying the problem, in indicating some 
alternative way of structuring teaching 
ability, in indicating some of the compo- 
nents of teaching ability, and in indicating 
some matters that need to be kept in mind 
in developing designs for future research. 
While the researches here summarized
vary in quality and are far from perfect,
substantial progress has been made in both
teacher evaluation and in the prediction of
teacher effectiveness. In the prediction
of teacher effectiveness, where multiple
regression equations were employed, mul-
tiple R’s ranging from .70 to .80 were
secured. One should remember, of course, 
that prediction depends upon the criteria. 
As more variables are added the standard 
error of the coefficient of correlation al-
so increases. Before more progress can
be made, first of all, a more adequate
criterion must be developed; then there
must be better data-gathering devices;
and blind spots in teacher efficiency must
be filled in. Coefficients of correlation
of .70 to .80 are quite satisfactory for
some purposes if the standard error can
be reduced. As far as this author knows, 
at no place in either psychology or educa-
tion, notwithstanding the tremendous
amount of research, has anyone been able
to predict effectiveness in a complex field 
like teaching with greater accuracy. As 
has been pointed out elsewhere, it is en- 
tirely conceivable that the constituents of 
teacher effectiveness are not additive and 
that better measures of effectiveness and 
prediction can be had by new concepts of 
the structure of teaching ability. By the 
use of the insights and the techniques here-
in presented, carefully and cautiously as
one should, one can expect to improve up- 
on current practice. 
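
The caution about adding variables can be illustrated with the usual adjusted R-squared correction, 1 - (1 - R^2)(n - 1)/(n - k - 1), which estimates how far a sample multiple R shrinks once the number of predictors k is weighed against the number of cases n. This is a related, standard correction and not a computation reported in the investigations; the sample size and predictor counts below are hypothetical.

```python
import math


def shrunken_multiple_r(r, n, k):
    """Estimate the multiple R after correcting for the number of predictors.

    r: multiple R obtained in the sample
    n: number of cases (hypothetical in the example below)
    k: number of predictor variables in the regression equation
    Uses adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1),
    floored at zero before taking the square root.
    """
    if n - k - 1 <= 0:
        raise ValueError("need more cases than predictors plus one")
    adjusted_r2 = 1.0 - (1.0 - r * r) * (n - 1) / (n - k - 1)
    return math.sqrt(max(adjusted_r2, 0.0))


# A sample R of .75 from 50 teachers, with increasing numbers of predictors.
for predictors in (2, 5, 10, 20):
    print(predictors, round(shrunken_multiple_r(0.75, 50, predictors), 3))
```

With the number of teachers held fixed, each added predictor lowers the corrected value, so an R of .70 to .80 from a small group and many measures warrants more caution than the same R from a large group and few measures.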


21. The evaluation of human efficiency at
whatever level and for whatever purposes
is an exceedingly complex necessity which
needs to be made with extreme care. To
secure accurate evaluations, one must
utilize every known check on accuracy,
such as multiple criteria until different
criteria can be shown to give similar re-
sults or other criteria can be shown to be
untenable, with a variety of data-gathering
devices all chosen because of their pre-
sumed validity and coverage, and more
than one evaluator who will employ data
collected over some period of time. Much
of human import depends upon the accuracy
of teacher evaluation. Some would not
evaluate teachers, but evaluation is ines-
capable; that is, evaluations are generally made,
whether made openly and carefully or
made subversively and haphazardly. It
has been the purpose of the investigations
here summarized to clarify some of the
problems and issues associated with valid
and reliable evaluations.


IV. Some More Specific Findings 





The findings here summarized are those of 
single investigations or some few studies as con-
trasted with the generalizations just stated which
were generally based upon several studies. 

1. Barr (3) in a study of the behavior of good
and poor teachers found no teacher behaviors that
distinguished one group from the other. He concluded
that there were apparently critical factors
and contributing factors in teacher effectiveness 
and that there was an appropriateness aspect to 
teacher activities that must be taken into con- 
sideration in teacher evaluation. Behavioral data 
might be used as a basis for inferences drawn 
about the personal and professional prerequisites 
to effectiveness. 

1a. Butler (15) defined the competencies nec-
essary for effective teaching in the realm of 
classroom management based upon published re- 
search. 


2. Jayne (37) from two parallel studies of the
relationship between specific teaching procedures
and pupil gain, found no single procedure to be 
significantly correlated with pupil gain in both 
studies. He concluded that procedures are pos- 
itively, negatively, or not at all related, depend- 
ing upon purposes and conditions. 

3. Brookover (12) found that teachers with a 
high degree of person-to-person interaction with 
their students tend to be rated high as instructors 
by these same pupils (r = .64), but these evaluations
correlated near zero (r = .08) with administrators’
ratings.

4. Von Haden (82) found that personal subjec- 
tive data made a substantial contribution to the 
accuracy of predictions of teacher effectiveness. 

5. Grotke (31) found that the data of his study 
did not fully support the hypothesized relationship 
that professional distance between the raters and 
ratees might be related to merit rating. 

6. La Duke (45) found that the intelligence of
seventh- and eighth-grade teachers in one-room
rural schools correlated significantly with pupil
gain as the criterion (r = .61). Other studies
secured conflicting results arising from a differ- 
ent set of conditions. 

7. McCoard (55) found a speech factor in
teacher effectiveness.

8. Rostker (66) found statistically significant
coefficients of correlation between twelve tests
of intelligence, knowledge of content, mental
health and adjustment, and various attitudes with
a pupil gain criterion and non-statistically significant
correlations between five other tests and
five teacher rating scales.

9. Von Eschen (81) reports statistically significant
changes in teachers under supervision in
social attitudes as measured by the Hartmann
Social Attitudes of Secondary School Teachers,
Part II, and in teacher-pupil relations as measured
by the Torgerson Test of Teacher-Pupil Relationships.

10. Blum (10) found that students preparing 
to teach did not differ significantly from those 
preparing for four other professions as measured 
by the Minnesota Multiphasic Personality Inven- 
tory, but that they did differ in their vocational 
and non-vocational interests as measured by 
the Strong Vocational Interest Blank. 

11. Jackson (36) constructed a test of social 
competency and studied its relationships to teach- 
er effectiveness. 

12. Schick (69) attempted unsuccessfully to 
validate a teacher judgment scale which corre- 
lated with a criterion of supervisory ratings, 
Fs oaks 

13. Lamke (46) from the use of the Cattell 
16 PF Test found that the responses of good and
poor teachers did not fall into two well-defined
patterns, but that there was more than one pat- 
tern for each. 


14. Manwiller (52) studied the expectancies
of school board members and found large areas
of agreement with teachers.

15. Stoelting (75) found for University of Wis- 
consin graduates that if the minimum grade-point 
average were increased from the 1.3, now em- 
ployed, to 1.5, formerly employed, the higher 
minimum would screen out 13 of the 24 who were 
judged to be less than average in teaching ability 
but at the same time eliminate 31 who were average,
7 who rated above average, and one who
rated superior. 

16. Martindale (53) thought that there were a
number of specific situational factors that were 
critical in teacher effectiveness. 

17. Jones (41) found from a use of the discriminant
function a significant difference between the
group of good teachers and the group of average 
teachers (none were judged to be poor) in terms 
of the profile of group means, with the greater 
differences found in 1) grade-point averages in 
professional courses; 2) grade-point average in 
courses related to the major; 3) flexibility in nu- 
merical ability; 4) dispositional rigidity as meas- 
ured by an Alphabet Test; and 5) a general activ- 
ity personality variable. 

17a. Knox (44) found a low positive relationship
between efficiency ratings and various situational
factors. The most dissatisfied teachers
received the lowest ratings and the least dissatisfied
(i.e., the most satisfied) the highest ratings.

18. Flanagan (26) and Rieck (62) found, in 
general, that selective factors had removed those 
with critical scores on the Minnesota Multiphasic 
Inventory but that individual differences still 
existed. 

19. Torrence (79) found no statistically significant
correlation between teacher effectiveness
as he measured it and the vocational agriculture
teacher’s knowledge of technical agriculture,
agricultural manipulative skills, knowledge of 
professional education or combinations of these. 

20. Bach (2) found a negligible relationship 
between practice teaching grades in the Wiscon- 
sin High School and the criterion of supervisory 
ratings. 

21. Gotham (30) concludes that one of the significant
findings of his investigation was the lack
of agreement found among the several criteria: 
between pupil change and tests of qualities com- 
monly associated with teaching success (.13); 
between pupil change and composite of scores on 
personality scales and tests (.27); and between
personality measures and tests of qualities commonly
associated with teaching efficiency (.32).

22. Doane (18) found in a study reported in 
1947 that the eight most frequently required 
courses in education were student teaching, educational
psychology, principles of teaching, general
psychology, methods of teaching one or more
of the school subjects, principles of secondary
education, introduction to education, and educational
measurements. One may assume that such
courses express in part the competencies expected
of teachers.

23. Best (9) found that candidates for the
teaching profession reveal a definiteness of purpose
relative to, and a sincere respect for, teaching
as a profession. These respondents gave as
their reasons for choosing teaching as a profession:
interest in children and young people,
opportunity to work in a field of major interest,
a life-long opportunity to learn, liking to
work with people, service to society, a good job to
fall back on in an emergency, long vacations,
travel, study, and an assured income.

24. Schmid (70) found the factor patterns for
women to be somewhat different from those for
men.

25. Riesch (63) found that the Washburne Social
Adjustment Inventory, the Wood Right Conduct
Test, and the California Test of Mental Maturity
were of little value in predicting the effectiveness
of rural or city seventh- and eighth-grade
teachers.

26. Thiede (76) found that women students
seeking the teachers’ certificate at the University
of Wisconsin scored nine percentile points above
the men on high-school rank and four percentile
points below them on the psychological examination,
but exceeded other undergraduates at the
University of Wisconsin in twenty-two of twenty-seven
comparisons. The men did not fare quite
as well, depending upon the area of specialization.

27. Schwartz (72) found that of factors meas- 
ured by nineteen Objective Tests of Primary 
Source Trait, only three showed a statistically 
significant relationship with more than one
criterion. Two-Hand Coordination and Reaction Time
gave statistically significant correlations with 
supervisory rating of teacher effectiveness in the 
field of teaching. 

28. Hampton (33) found that teachers were 
generally rated highest in cooperation and loy- 
alty, courtesy and friendliness, and lowest in 
knowledge of subject matter, general culture,
resourcefulness, and speech. Resourcefulness and
knowledge of subject matter seem to be the qual- 
ities most frequently sought by Iowa superintend- 
ents. 

29. Ringness (64) concluded that a teacher’s 
success is related to the nature of the reasons 
for the choice of teaching as a profession. He 
further found that an overall acceptability rating 
growing out of an interview with the superintendents
and an average of three ratings of teacher
efficiency correlated .90 for men and .82 for
women. 

30. Lien (48) concluded that differences exist 
between students in the various undergraduate 
curricula at the Wisconsin State College from 
which his subjects were drawn. Differences were
found in interests and personal qualities as well
as abilities.

31. Erickson (24) concluded from a factor 
analysis of nine measures of teaching success 
that there was no general factor, but three factors 
measuring different things. The prediction of
teaching success as measured by these three composite
criteria, i.e., from 7 measures of temperament
(Thurstone), 16 measures of personality
(Cattell), and 10 measures of pre-service
achievement and intelligence was not successful,
the correlations ranging from -.28 to .35.

32. Anderson (1) employed eight criteria
of teaching success among which no statistically
significant correlations were found. 

33. Montross (57) found that five objective 
measures of temperament, namely, a) speed of 
tapping, b) reaction time, c) fluency (adjectives), 
d) right- and left-hand coordination, and e) maze 
number 1 correlated significantly with a compos- 
ite of five supervisory ratings, the correlations 
ranging from .39 to .57 with a multiple R of .73.
In general, his findings are in agreement with 
those of Schwartz. 

34. Singer (73) found for social competency, 
and a social introversion score derived from the 
Minnesota Multiphasic Personality Inventory, and 
a Teacher-Teacher Social Distance Score moderately
high coefficients of correlation in almost
all cases with a composite rating criterion of
teaching success. 

35. Kline (43) concluded that the satisfactions 
and annoyances that attend teaching relate to 
teacher effectiveness. 

36. Barr (4) provided a definition of the 
teaching process, and indicated in some detail 
the sorts of data and data-gathering devices that 
one might use in the study of teachers and teach- 
ing. 

37. Stevens (74) developed a projective technique
that she employed in a study of the perceptions
tions of high-school seniors of teachers and 
teaching. 

38. Torgerson (78) was among the first to 
use a pupil gain score as a criterion of teacher 
effectiveness. He later studied teacher-pupil re- 
lationships and the mental hygiene of the class- 
room situation. 

39. Schoonover (70a), Sigurdson (72a), and 
Struck (75a) made early studies of the objectivity 
and reliability of behavioral studies of teachers 
and teaching. 


V. Data-Gathering Devices that Appear to Warrant Further Consideration

Collectively the investigations here summarized
have employed a very large number and a
very great variety of data-gathering devices. 
Some of these devices were especially construct- 
ed for the particular investigations here summar- 
ized and some were chosen from the literature. 







All of the devices have come into being to meet a 
special need as seen by one or more groups of 
persons. All possess validity in varying amounts 
as can be seen from the data relative to each
provided in an earlier chapter.

One of the aims of the investigations here 
summarized was to provide a more or less common
treatment for a number of devices thought to
measure teacher effectiveness, with a common 
criterion, and thus supply data on their validity 
under a new set of conditions as described in the 
investigations here summarized. Many of the de- 
vices were employed in two or more of the inves- 
tigations. The list to follow is composed of de- 
vices that appear to have promise according to 
1) correlational data with reference to each pro- 
vided in the investigations here summarized, and 
2) face validity and impressions gained from the 
actual use of the various devices in a variety of 
situations. 

In this connection, it should probably be 
pointed out that correlational data are difficult to 
interpret, particularly inasmuch as coefficients 
of correlation cannot be taken at face value. The 
size of the coefficient of correlation depends upon 
many things including only in part the amount of 
relationship that may exist between a variable 
and a criterion. To a certain extent the size of
the coefficient of correlation depends upon errors
of measurement. It also depends upon selective 
factors that have operated through the processes 
of school education, and the spread of talent that 
remains for the groups studied. As a result of 
many things, an obtained coefficient of zero, for 
example, may not mean that there is no relation- 
ship between a given variable and a criterion but 
merely that certain conditions exist. If the test’s 
reliability appears adequate for the uses to which
it is put, it may be concluded that other factors 
have operated to eliminate any observable rela- 
tionship. 
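
The dependence of an obtained coefficient upon errors of measurement can be illustrated with the classical attenuation relation of test theory. This relation is not stated in the investigations here summarized and is assumed only for illustration: the observed correlation is taken to equal the true correlation multiplied by the square root of the product of the two reliabilities. The function name and all numerical values below are hypothetical.

```python
import math


def attenuated_r(true_r: float, rel_x: float, rel_y: float) -> float:
    """Observed correlation expected under the classical attenuation
    relation: r_obs = r_true * sqrt(rel_x * rel_y).

    rel_x and rel_y are the reliabilities of the predictor and of the
    criterion (assumed values, each between 0 and 1).
    """
    if not (0.0 <= rel_x <= 1.0 and 0.0 <= rel_y <= 1.0):
        raise ValueError("reliabilities must lie between 0 and 1")
    return true_r * math.sqrt(rel_x * rel_y)


# Hypothetical case: a true relationship of .60, a test with
# reliability .70, and a rating criterion with reliability .50.
observed = attenuated_r(0.60, 0.70, 0.50)
print(round(observed, 2))  # about .35
```

Under these assumed values a fairly substantial relationship would appear as a coefficient just below the cut-off adopted in this summary, which is one reason a low obtained coefficient need not mean a weak underlying relationship.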

With these limitations in mind, and with many
reservations, an arbitrary cut-off point has
nonetheless been set for the correlations reported
in these investigations. While the cut-off point
might be set at any one of a number of points, it
was here arbitrarily set at r = .36 by reasoning
such as the following:

If these were sampling studies, which they 
are not, and the samples ranged in size from 30 
to 50 subjects, which they do, with some excep- 
tions, a coefficient of correlation of .36 would be 
statistically significant at somewhere between 
the one and five percent levels, roughly estimated.
Then, if one were employing the coefficient
of predictive efficiency, which is a descriptive
statistic, the predictive efficiency for r = .36,
depending upon whether the coefficients are rounded
to one, two, or three decimal places, is roughly
seven percent. Another part of the logic back
of the recommendations to follow is that of consistency,
i.e., the number of times a particular
measure appears above the cut-off point in rela- 
tion to the number of opportunities that a meas- 
ure has to appear. Rank among other measures 
of its kind may be more meaningful than the ac- 
tual size of the coefficient of correlation. 
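
The arithmetic behind the cut-off can be sketched as follows. The sketch assumes the usual t test for a correlation coefficient (with n - 2 degrees of freedom) and the conventional definition of the coefficient of predictive efficiency, E = 1 - sqrt(1 - r^2); neither formula is written out in the investigations themselves, and the function names are hypothetical.

```python
import math


def t_statistic(r: float, n: int) -> float:
    """t value for testing a correlation r from n cases against zero
    (to be referred to a t table with n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def predictive_efficiency(r: float) -> float:
    """Coefficient of predictive efficiency, E = 1 - sqrt(1 - r^2):
    the proportional reduction in the standard error of estimate."""
    return 1.0 - math.sqrt(1.0 - r * r)


r = 0.36
for n in (30, 50):
    # Sample sizes at the two ends of the range mentioned in the text.
    print(n, round(t_statistic(r, n), 2))

print(round(100 * predictive_efficiency(r), 1))  # percent, roughly 7
```

The t values obtained for 30 and 50 cases can be compared with tabled critical values at the five and one percent levels; the predictive efficiency for r = .36 comes to a little under seven percent, in keeping with the rough figure given above.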

Inasmuch as these investigations have been 
conducted over a period of years, many of the in- 
struments are now out of date, or out of print, and 
superseded by other measures, but the types of
instrument found to have a significant relationship
are suggestive for further research. The list may be
helpful merely in that it indicates the sorts of
things that appeared promising in light of the criterion
used. Instruments supported by a single
investigation are not reported. Many of the in- 
struments can be fairly described as unfinished 
business. 


Data-Gathering Devices that May Contain Suggestions for Further Research

1. Measures of Cognitive Processes
a. Grade point average for all undergraduate
courses, for courses that compose the
teaching major, and for the teaching minor
plus subject matter tests over the content
taught.

b. Grade point average in professional
courses.

c. Rank in high-school class (uncorrected for
size of school and other factors).

d. Thurstone’s Psychological Examination
for College Freshmen, American Council
on Education, Washington, D. C., 1936,
1938, 1939 editions.

e. National Teachers Examination, Educational
Testing Service, New York, 1941
edition.

f. Teachers College Psychological Examination,
State Teachers College, St. Cloud,
Minnesota.

2. Measures of Affective Processes
a. Yeager, Scale for Measuring Attitude
Toward Teachers and the Teaching Profession,
Teachers College, Columbia University,
1935.

b. The Minnesota Teacher Attitude Inventory
by W. W. Cook, Carroll H. Leeds, and
Robert Callis, Psychological Corporation,
Washington, D. C., 1951. Originally described
by Leeds and Cook in ‘‘The Construction
and Differential Value of a Scale
for Determining Teacher-Pupil Attitudes,’’
Journal of Experimental Education, XV
(1947), pp. 149-59.

c. Hartmann, Social Attitudes of Secondary
School Teachers, John Dewey Society, New
York, 1937.

3. Measures of Special Skills
a. Teacher-pupil relationships. Torgerson
Test of Teacher-Pupil Relations, unpublished
materials, University of Wisconsin,
1930.

b. A speech proficiency test (per McCoard)
to cover the ordinary mechanics of effective
speech, fluency, vividness, etc.


4. Measures of Personal Characteristics and
Personality
a. Composite ratings on qualities such as
initiative and professional judgment made
by three readers of autobiographies written
by undergraduate students.

b. Torgerson, Theory and Practice of Mental
Hygiene, unpublished materials, Department
of Education, University of Wisconsin.

c. Washburne, Social Adjustment Inventory,
Thaspic Edition, Syracuse University.

d. The Schwartz Teacher Temperament Scale
(based upon data provided by Cattell on
character and personality), unpublished
materials, Department of Education,
University of Wisconsin. Described by
Schwartz in the Journal of Experimental
Education, XIX (September 1950).

e. The Minnesota Multiphasic Personality Inventory,
by Starke R. Hathaway and J.
Charnley McKinley, Psychological Corporation,
Washington, D.C., Revised Edition,
1951.

f. The Guilford-Zimmerman Temperament
Survey, Sheridan Supply Co., Beverly
Hills, California, 1949.

Unfinished Business

g. Barr, ‘‘Teacher Perception of Personal
Characteristics,’’ unpublished material,
Department of Education, University of
Wisconsin. Described in School Review,
LXVIII (Winter 1961), pp. 400-408.


5. General Merit Rating
a. Almy-Sorenson Rating Scale for Teachers.
Public School Publishing Co., Bloomington,
Ill., 1930, described in Educational
Administration and Supervision, XVI
(March 1930), pp. 179-86.

b. Michigan Education Association Teacher
Rating Card. Michigan Education Association,
Lansing, Michigan, 1930.

c. Torgerson Diagnostic Teacher Rating
Scale. Public School Publishing Co.,
Bloomington, Ill., 1930.

d. Wisconsin Adaptation of the M Blank of the
Evaluative Criteria, unpublished material,
Department of Education, University of
Wisconsin. (Effectiveness is inferred
from observable teacher and pupil behavior
on various aspects of the teaching process
with an overall merit rating.)

e. A prediction made upon the basis of a
composite of two structured interviews
made during the senior year in college to
secure information relative to professional
judgment; social ability; work habits;
motivation; and originality.

Unfinished Business

f. Barr-Harris Teacher Performance Record.
Dembar Publications, Madison, Wisconsin,
1943. A sequential record of
teacher and pupil behavior from which effectiveness
is inferred and given a point
score.


VI. Needed Research


More research is needed to clear up many unsolved
problems relative to the measurement and
prediction of teacher effectiveness. A short list
of the most pressing problems is given below.

1. One of the problems that needs clarification,
possibly before all others, is that of the criterion
of teacher effectiveness. This problem is
complex with many sub-issues:


a. First, there must be an acceptable theory
of the nature and structure of teaching
ability. Since the turn of the century many
thoroughly competent persons have given
prolonged attention to this problem. In its
more general aspects Spearman, Thurstone,
Thomson, Kelley, Burt, Guilford,
Cattell, and many others have studied this
problem. The prevailing theory is an elementalist
factor theory with the chief differences
among specialists arising over
the actuality of general ability, group factors,
and certain unnamed specifics. The
elements are presumed to be additive. A
hypothesis advanced in this summary is
that possibly the ingredients of effectiveness
are not additive and an inadequacy in
any one of a number of critical factors
may make the teacher thoroughly unacceptable
to the administration, the community
or other teachers and these attitudes
may bring about the immediate removal
of the teacher or may create a climate
in which her usefulness or effectiveness
is reduced, notwithstanding her abilities
in other respects. The factors which
are considered critical and the tolerance
permitted may vary from community to
community.

So far we have been speaking of structure.
Neither is one certain whether all
the parts have been, as yet, made known.
If one thinks of teachers as machines,
which of course they are not, we frequently
see these machines in operation, but no
one has, as yet, put one together successfully,
notwithstanding the many attempts
to do so. It is not clear whether there are
essential parts still missing or whether
the parts have not been assembled as they
should be.

b. Second, any theory of teacher effectiveness,
to be tested, involves data-gathering
devices. Operationally, concepts of human
abilities are no better than the instrument
that one employs in identifying them.
Apparently there are quite a few constituents
to effectiveness, and thus, possibly
a very large number of data-gathering devices.
Imperfection in any one data-gathering
device may make inoperative the
whole model. While, obviously, much
progress has been made in measuring the
constituents of teacher effectiveness,
there is much more that needs to be done.
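
The contrast between the additive theory and the critical-factor hypothesis described under 1a can be made concrete with a small sketch. The factor names, the rating scale, and the community minimums below are hypothetical and are not drawn from any of the investigations; the sketch only shows how the two kinds of model can disagree about the same teacher.

```python
from typing import Dict


def additive_score(ratings: Dict[str, float]) -> float:
    """Additive model: strengths offset weaknesses, so the teacher is
    summarized by the mean of all ratings."""
    return sum(ratings.values()) / len(ratings)


def critical_factor_acceptable(ratings: Dict[str, float],
                               minimums: Dict[str, float]) -> bool:
    """Non-additive model: falling below the tolerated minimum on any
    one critical factor makes the teacher unacceptable, whatever her
    abilities in other respects."""
    return all(ratings[factor] >= floor for factor, floor in minimums.items())


# Hypothetical ratings on a 0-10 scale.
teacher = {"subject_knowledge": 9.0,
           "pupil_relations": 8.5,
           "community_standing": 2.0}

# Hypothetical tolerances; these may vary from community to community.
community_a = {"subject_knowledge": 5.0, "community_standing": 4.0}
community_b = {"subject_knowledge": 5.0, "community_standing": 1.5}

print(additive_score(teacher))                           # 6.5
print(critical_factor_acceptable(teacher, community_a))  # False
print(critical_factor_acceptable(teacher, community_b))  # True
```

The same teacher receives a respectable additive score yet is unacceptable in one community and acceptable in another, which is the pattern the hypothesis anticipates.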

2. Two models, in general, have been employed
in the study of teacher effectiveness. One
is the carefully controlled model, and the other
the uncontrolled natural events model. These
models differ in many respects but attention is
directed here to two differences. First, there is
ordinarily a difference between the two models in
their inclusiveness, the controlled model being
traditionally limited to the study of a few or
a single aspect of teacher effectiveness, as in
controlled experimentation, and the uncontrolled
natural events model being traditionally employed
in studies which attempt to put together in a single
meaningful whole some large segment of the factors
thought to determine teacher effectiveness.
Second, these models attempt to eliminate the influence
of situational factors differently, one by
equating effects or holding them constant, the
other by measuring their influence and eliminating
or holding constant their effects through statistical
manipulation. There is great need for
the integration of these two approaches to teacher
effectiveness. All too frequently the users of
these two different approaches proceed with minimum
awareness of each other. They may even
come from different disciplines and employ different
vocabularies. The controls, whether experimentally
or statistically imposed, should ordinarily
not differ in intent, as shown by prior
research. Verbal acceptance of this union of controlled
experimentation and statistics is not
enough. We need empirical tests of their going-togetherness.

3. It has been frequently pointed out in this 
monograph that different people employ different 
criteria and approaches to the evaluation of teach- 
er effectiveness. These differences arise in part 
out of differences in the theories of the nature of 
human abilities that different people hold but 
more frequently from personal preferences, some 
preferring to approach effectiveness from the
point of view of its personal prerequisites, some
from the point of view of teacher-pupil behaviors,
some from the point of view of basic knowledges,
attitudes, and skills, and some from the point of 
view of products. The data in the investigations 
here summarized seem to show, with a high de- 
gree of consistency, that these different approach- 
es give different answers to the question: Is this 
teacher an effective teacher? Research needs to 
be undertaken to resolve these differences. 

4. Frequent reference is made in this sum- 
mary to the problem of vocabulary. It would 
seem that every researcher has his own private 
system of talking about teacher effectiveness. Re- 
search in these areas could be greatly facilitated 
if these different vocabularies could be harmonized.
From a study of the many researches in
this area it would seem that difficulty arises only
partially out of personal preferences in the realm
of vocabulary, but more generally because each
researcher characteristically chooses a different
emphasis or aspect of teacher effectiveness
for investigation, which, of course, is in a manner
necessary if progress is to be made. Possibly
the difficulty, which is acute from the practitioner’s
point of view, could be eased somewhat
if competent scholars from different disciplines 
would be willing to turn their know-how to the de- 
velopment of verbal equivalents. 

5. One of the very real difficulties confronting
the worker in this area is the uncompleted
assignment. There are, one can safely assume,
hundreds of factor analyses, experimental inves- 
tigations, and data-gathering devices initiated by 
doctoral candidates that have never gotten beyond 
the first exploratory or preliminary stages. Many 
of these researches contain very good first ideas 
but no one has followed through; as a result we 
have data-gathering devices that are good but not 
good enough, and experimental and statistical 
studies that give conflicting results that no one 
has bothered to resolve. In many instances, the 
conflicting results arise from changes in impor- 
tant ingredients from one investigation to another, 
which appear on the surface to be inconsequential
but which produce important differences. But 
some of the conflicting evidence arises over mat- 
ters that might be eliminated with more care and 
continued research. There is much needed re- 
search in the area of the unfinished task. 

6. One of the early and continuing concerns 
of the investigations here summarized was that of 
teacher-pupil relations. The interaction of teachers
and pupils in the situation in which teaching
takes place is the focal point of teaching and one 
of the critical factors in teacher effectiveness, 
whatever the objective. The early concern was 
with subject matter objectives; gradually the em- 
phasis has shifted from subject teaching per se to 
include method of studying, working, and thinking, 
and pupil adjustment. Most of the current concerns
are with on-the-spot interactions as they
relate to these latter concerns. A matter that 
may be of increasing concern is the permanent 
effect that results from these teacher-pupil inter- 
actions and their impact upon personality, which 
arises slowly and from many influences: home 
and parental influences, peer influences, and the 
more general cultural influences that arise in the 
environment in which youth grow into adulthood. 
It will not be easy to isolate the teacher influ- 
ences in this respect, but research is needed to 
clarify the situation. 

7. One of the insights developed from these 
investigations is that specific teacher acts may 
not be right or wrong in and of themselves, taken 
individually, and that there are many alternatives,
one about as good as another. But one may use
behavior records in drawing inferences about important
prerequisites to teacher effectiveness. These
inferences may relate to the personal prerequi- 
sites to teacher effectiveness, to the profession- 
al prerequisites, or to any and all aspects of the 
learning-teaching situation. This has been the 
focal point of classroom visitation and conference 
from the very earliest conception of supervision. 
The crucial problem is not that we be concerned 
with them but how to render these inferences 
more valid and reliable. Traditionally, these inferences
have been very subjective and personal.
The problem is: How does one order these oper- 
ations to bring them more generally in keeping 
with the conventions of science? There is much 
needed research in this aspect of teacher effec- 
tiveness. 

8. There has been much research on the per- 
sonal prerequisites to teacher effectiveness. 
When these prerequisites are operationally (be- 
haviorally) defined they may be inferred from 
studies of observed teacher behavior. There are
reasonable needs here. First, much of the research
pertaining to the classroom behavior of
teachers has been done relative to some few ideas
such as teacher-centered versus pupil-centered
teaching, democracy versus autocracy in the
classroom, and permissiveness versus restrictiveness.
The range of interest should be broadened
to include a larger group of the personal and
professional prerequisites to teacher effectiveness.
On the personal side, the concerns should
be extended to include source traits as well as
surface traits; and on the professional side to include
those characteristics that appear to differentiate
teaching from other occupational groups.
Second, teacher behaviors should be related to
those principles of learning that have so long been
the concern of educational psychology. The teacher’s
use of these may also be inferred, when operationally
defined, from studies of teacher behavior.
There is much needed research in this
area. 

9. More attention needs to be given to the
critical factors in teacher efficiency. There are
an infinite number of things that teachers should
do, and were they done well they would add
‘‘class’’ and ‘‘polish’’ to the teacher’s performance,
but they are not basically essential. They
are not critical. Some knowledge of the subject
taught, or interest in teaching, or in pupils, or
professional know-how, would seem to be essential
to teacher effectiveness. What is the essence
of good teaching? There is much knowledge of
teaching that is more or less extrinsic; it would
economize upon the efforts of many people if some
way, somehow, the usable could be separated 
from the non-essentials. This is a matter that 
will demand close examination of the evidence at 
hand and the planning of new research to sepa- 
rate the moderately useful from the essential. 

10. There are a lot of people that have expectancies
relative to the teacher: board members,
superintendents, principals, supervisors,
other teachers, pupils, parents, and other mem- 
bers of the community, and they have expectan- 
cies about almost everything a teacher does. 
Some of these expectancies would seem to be quite 
essential to a teacher’s success in a given school 
or community. Can these expectancies be more 
systematically spelled out and instruments devel- 
oped for identifying them in particular teaching 
situations?

11. It is difficult to find good definitions of
teaching. Teachers do many different sorts of
things: they teach the different school subjects to
different sorts of pupils at different grade levels;
they are presumably friends and counselors
of students; they may be directors of extra-curricular
activities of many sorts, both inside the
confines of school and without; they are members
of various professional and community groups;
and they are presumably good citizens. Charters provided
a definition of teaching in the twenties; there is
need for a definition of teaching in the sixties.

12. Barr (4) provided, in the early thirties,
a definition of the teaching act as seen through the
eyes of the supervisor and as applied to classroom
supervision and the improvement of teaching.
There is need of such an analysis of the teaching
process prepared for the use of teachers to guide
them in their efforts at self-improvement. Possibly
one of the marks of a good teacher is her
understanding of the teaching process. Some students
of teaching believe that many teachers think
of teaching techniques as timeless and universally
applicable regardless of context rather than as
parts of some sort of on-going process defined by
purposes, persons, and situations. Possibly the
teacher’s comprehension of the teaching process
in all of its many ramifications, as probed by tests,
is an essential of teacher effectiveness. Research
in this direction might prove profitable.

13. Teachers frequently achieve far below
their potentialities. In the absence of more information
about the aspects of each learning-teaching
situation that may limit or facilitate the teaching
performance, predictions of teacher effectiveness
based upon potential are likely to be quite inaccurate
regardless of their accuracy as predictions
of potential. Teaching is limited and facilitated
by many aspects of the situation: each teaching
assignment places in some respect different demands
upon the teacher; the expectancies are different
for different schools and communities;
teacher morale is very high in some schools and
communities and low in others; the administrative
staff and its philosophy of administration may be
more acceptable to some teachers than to others;
there are many annoyances and satisfactions that
attend teaching, more in some communities than
others; some teachers find more professional
compatibility in some schools and communities
than in others. Research is needed to spell out
aspects of the situation that may limit or facilitate
teaching up to potential.

14. Given a valid criterion of teacher effec- 
tiveness, one of the best tests of one’s command 
over the constituents of effective teaching is 
one’s ability to improve some particular teacher’s 
effectiveness through the manipulation of vari- 
ables thought to be associated with effectiveness. 
Demonstrations of this command are legion, but 
frequently not convincing because of inadequacies 
in the theoretical structure out of which such re- 
searches grew: inadequate definitions of the prob- 
lems to be studied, inadequate definition of im- 
portant vocabulary, inadequate data-gathering 
devices, inadequate research design, inadequate 
controls, and inferences not supported by the data
collected. There is need for valid demonstra- 
tions of our ability to produce effective teachers 
upon demand. 

15. Much of the research relative to teacher
effectiveness pertains to a particular world of
realities in which teaching takes place. Fortunately,
or unfortunately, teachers do not always
live in this world of reality. They have their own
constructs by which they live. Thus the importance
of research relating to the teacher’s perception
of almost any and all aspects of teaching:
purposes, persons, and situations. Important
beginnings have been made in self-perception and
the perceptions of others with reference to the
personal prerequisites to teacher effectiveness.
But in the absence of convincing research there
is no particular reason for assuming that her perceptions
in one area are any more important than
in another. These perception studies need to be
extended to other important components of teacher
effectiveness.

16. The foundations for effectiveness will be
found in the biophysical sources of energy and
drive. The superstructures built upon these
foundations have been extensively explored, but
there is need for careful examination of the
foundations. These foundations will be found in the
genes, bodily structures, and physiological functioning:
the glandular system, metabolism, the
circulatory system, the nervous system, the special
senses, and all those aspects that make the
individual an efficient entity. Individuals differ in
all of these respects. What are the critical elements
in these biophysical foundations of human
behavior? Few educationalists, or psychologists
for that matter, can be said to be experts in these
biophysical determiners of human effectiveness,
but more information is needed about them.

17. Skill in speech is closely associated with
teacher effectiveness. Many studies have now
been made of fluency and the commonplace verbal
process itself. It is not enough merely to study
verbal behavior. What is verbal behavior? How
does it relate to thinking and other bodily functions?
Teachers use many words in the run of a
day’s teaching. These words used by teachers
have different meanings from time to time and
from person to person. Some of these meanings
are fairly adequate, correct, and complete; some
are not. In these daily vocabulary exercises
words are treated as substitutes for reality. There
is need for studies of the words by which teachers
live.

18. The teacher makes many judgments in the
run of a day’s teaching: judgments about pupil
needs and the means to be employed in satisfying
them; judgments about rewards and punishments;
judgments about standards of achieving and the
means of assessing them; judgments about individual
differences and what to do about them; judgments
about learning aids and how to use them;
judgments about pupil control, and the many other
things associated with teaching. Teacher effectiveness
is closely associated with these
minute-to-minute decisions made by teachers. More
information is needed about them and how to assess
their appropriateness.


VII. Some Observations on Teacher 
Evaluation Programs 








1. Some would like to escape evaluation. This is 
not, however, very likely. It would seem that 
teachers have always been evaluated; they are 
now evaluated, and they will probably continue 
to be evaluated as long as there are teachers. 
The problem is how to bring these evaluations 
in the open and to improve their accuracy. 


2. It should be reasonably clear that teacher evaluation
is an exceedingly complex matter and
those that engage in such activities should be
aware of its complexity, of the possibilities of
arriving at erroneous judgments, and of the
consequences that follow from such evaluations.


3. There is plenty of evidence to indicate that different
practitioners observing the same teacher
teach, or studying data about her, may arrive
at very different evaluations of her; this
observation is equally true of evaluation
experts: starting with different assumptions,
employing different approaches, and using
different data-gathering devices, they, too,
may arrive at very different evaluations.


4. It is assumed that each school system may
prefer to develop its own plan for evaluating
teacher effectiveness, taking into consideration
local needs, attitudes, and insights.
The attitudes and the insights of the participants
are important items in the success of
any plan of teacher evaluation. It is ordinarily
best to start on an experimental basis.


5. Most evaluators attempt to make judgments
about small differences in effectiveness that
do not seem to be possible at the present
time, considering current know-how and
data-gathering devices. Possibly for the
time being it might be best to attempt to set
up only broad categories of teacher effectiveness,
such as superior, adequate, and
inadequate, and to do this with reference to
carefully defined situations.


6. Those who set up evaluation programs should
probably keep in mind the fact that evaluations
are made for different purposes, such as
teacher certification, employment, improvement in
service, and the fixing of salary schedules; that
they are done at different points in a time sequence;
and that they are done under varying sets of
conditions. These different purposes may make
a difference in the teacher evaluation program.


7. There are different approaches to evaluation.
Some would evaluate in terms of the basic
prerequisites to teacher effectiveness:
knowledges, skills, and attitudes; some in
terms of teacher performance: behaviors
and activities; some in terms of the personal
prerequisites to teacher effectiveness; and
some in terms of pupil growth and achievement.
Each approach has its advantages and
disadvantages.


8. There are many different sorts of data-gathering
devices employed in teacher evaluation:
observation of teachers at work, unaided and
aided by instrumentation such as recording
devices, check lists, rating scales, and the
like; tests of qualities thought to be associated
with teacher effectiveness; questionnaires
and interviews directed to the teacher
or others that may be acquainted with the
teacher’s work; documents and records of
various sorts, including data about the foregoing,
autobiographies, and the like. From
these sources one may collect data of varying
validity and reliability. The data will not
be perfect.


9. Evaluations may be made by many people:
superintendents, principals, supervisors,
college professors, other teachers, pupils,
and parents with different concepts of teacher
effectiveness, varying amounts of training
in the handling of data, and with different
levels of professional sophistication. They
frequently have a different perception of
teaching and evaluate teachers differently.


10. For the time being, it would seem best, at
least until the situation has stabilized, to employ
more than one approach to teacher evaluation,
and a variety of data-gathering devices
chosen for their known validity and reliability,
with data collected over some period of
time and assessed by more than one person.
Programs for the careful training of evaluators
have been shown to be effective.


11. The evaluation of a teacher’s effectiveness,
when properly done, is a time-consuming activity,
and when made with due regard to its
complexity may better be done not annually,
as now pursued, but merely from time to time
as a need arises, and at critical points in the
teaching cycle, as for example at the time of
employment, at the time of the termination of
a probationary period of employment, or at
times when a major change in status is contemplated.


12. Consideration should be given to the collection
of data about such basic prerequisites as
   A. Knowledges
      a) General cultural background
      b) Knowledge of subject taught (or activity
         directed)
      c) Knowledge of child development, behavior
         and learning
   B. Attitudes
      a) Interest in subjects, pupils, and teaching
      b) Social attitudes and values
      c) Motivation
   C. Skills
      a) Skill in communication
      b) Skill in teacher-pupil relations
13. Consideration should be given to
   A. Personal fitness
   B. Professional competency, as inferred
      from a) systematic studies of teacher-pupil
      behavior and conditions in the classroom,
      and from b) tests and other data-gathering
      devices pertaining to these.
14. Consideration should be given to the products
of teacher leadership
   A. As director of learning
      a) Informational learning
      b) Attitude changes: interest in the subject
         taught; attitudes peculiar to the
         subject taught; attitudes shared with
         other subjects
      c) Special skills peculiar to the subject
         taught; shared with other subjects
      d) Personality development and adjustment
   B. As a friend and counselor of pupils
   C. As a member of school community
   D. As a member of groups of professional
      workers
15. In collecting data relative to the foregoing,
remember that data-gathering devices are
highly fallible: the title given to the instrument
may be misleading; the notion of teacher
effectiveness underlying the instrument
may be fallacious; the coverage may be incomplete;
key words and terms may not be
defined or may be poorly defined; the directions
for the use of the instrument may be incomplete
or ambiguous; the means of recording
data, scores, etc., may be inadequate;
the separation of data-gathering and evaluating
processes may not be clearly indicated;
the sampling of behavior may be inadequate;
to mention only a few of the possible shortcomings
that may be found in the data-gathering
devices themselves. But there are
other dangers: some instruments, no matter
how good in and of themselves, are dangerous
in the hands of some people because of
the lack of professional sophistication, because
of deep-seated preconceived convictions
that may be erroneous, and because of
willful falsifications of data that may arise
out of personal incompatibilities; and teachers
vary in effectiveness from time to time
and under different conditions.





16. Within, and cutting across, the foregoing suggestions,
there are four major considerations
that must be kept in mind:

   A. Teacher acts are not good or bad in general
      but only in context of purposes, persons,
      and situations. They may be employed
      in operational definitions of important
      constituents of effectiveness and as
      data for making inferences about personal
      fitness and professional competencies,
      but not as means of distinguishing good
      teaching from poor teaching in and of themselves.

   B. The constituents of effectiveness are not
      found in teachers, or in pupils, or in situations,
      but in the relationships that exist
      among those at any given time and place.
      The learning-teaching situation is a dynamic
      situation and must be so viewed.

   C. Current attempts to evaluate teacher effectiveness
      deal with certain types of realities
      that must be given consideration,
      such, for example, as the perceptions of
      teachers, pupils, parents, and administrators
      of what goes on and under what
      conditions. It is not enough to know merely
      what is, but it is equally important that
      we know what people think is.

   D. Many people have expectancies relative to
      teaching: other teachers, supervisors,
      administrators, pupils, parents, board
      members, etc., and these expectancies
      must be given careful consideration in each
      particular learning and teaching situation.


Only a few of the more important considerations
have been here summarized.
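
The observations above note that different evaluators may arrive at very different evaluations of the same teacher, and suggest broad categories assessed by more than one person. One way such disagreement might be quantified is sketched below. This is an illustration only, not a procedure taken from the studies summarized: it uses Cohen's kappa, a chance-corrected agreement coefficient, and the function name, the category labels as data values, and the six sample ratings are all hypothetical.

```python
from collections import Counter

# Broad categories suggested in the text; used here as hypothetical data values.
CATEGORIES = ("inadequate", "adequate", "superior")


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters of the same teachers."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of teachers")
    for rating in list(ratings_a) + list(ratings_b):
        if rating not in CATEGORIES:
            raise ValueError(f"unknown category: {rating!r}")
    n = len(ratings_a)
    # Proportion of teachers placed in the same category by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in CATEGORIES) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of six teachers by a principal and a supervisor.
principal = ["adequate", "superior", "adequate", "inadequate", "adequate", "superior"]
supervisor = ["adequate", "adequate", "adequate", "inadequate", "superior", "superior"]
kappa = cohens_kappa(principal, supervisor)
print(round(kappa, 3))
```

A value near 1 would indicate close agreement and a value near 0 agreement no better than chance. The coefficient says nothing about which rater is correct, which remains the criterion problem discussed throughout.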





STUDIES SUMMARIZED 


Anderson, Harold M., “A Study of Certain
Criteria of Teaching Effectiveness,” Journal
of Experimental Education, XXIII (Sept.
1954), pp. 41-77.

Bach, Jacob O., “Practice Teaching Success
in Relation to Other Measures of Teaching
Ability,” Journal of Experimental Education,
XXI (Sept. 1952), pp. 57-80.

Barr, A.S., Characteristic Differences in
the Teaching Performance of Good and
Poor Teachers of the Social Studies
(Bloomington, Ill.: Public School Publishing
Co., 1929), 127 pp.

Barr, A.S., An Introduction to the Scientific
Study of Classroom Supervision (New York:
D. Appleton and Co., 1931), pp. 295-365.

Barr, A.S., “The Measurement and Prediction
of Teaching Efficiency: A Summary of
Investigations,” Journal of Experimental
Education, XVI (June 1948), pp. 203-283.

Barr, A.S., “The Assessment of the Teacher’s
Personality,” School Review, LXIX
(Winter 1961), pp. 400-408.

Barr, A.S. and Emans, Lester M., “What
Qualities are Prerequisite to Success in
Teaching?” Nation’s Schools, VI (Sept.
1930), pp. 60-64.

Barr, A.S., and others, “The Validity of
Certain Instruments Employed in the Measurement
of Teaching Ability,” in The Measurement
of Teaching Efficiency, edited by
Helen M. Walker (New York: Macmillan
Co., 1935).

Best, John Wesley, A Study of Certain Selected
Factors Underlying the Choice of
Teaching as a Profession, unpublished
Ph.D. Dissertation, University of Wisconsin,
1948.

Blum, L. P., “Comparative Study of Students
Preparing for Five Selected Professions
Including Teaching,” Journal of Experimental
Education, XVI (Sept. 1947),
pp. 31-65.

Bollinger, Russell V., The Social Impact of
the Teacher on Pupils, unpublished Ph.D.
Dissertation, University of Wisconsin,
1939.

Brookover, Wilbur B., “The Relation of
Social Factors to Teaching Ability,” Journal
of Experimental Education, XIII (June
1945), pp. 191-205.

Brandt, Willard J., A Follow-up of Some Earlier
Wisconsin Studi ili,
unpublished Ph.D. Dissertation, University
of Wisconsin, 1949.





Briggs, Edgar V., A Follow-up Study of a
Group of University of Wisconsin Graduates
Ten Years After Graduation, unpublished
Ph.D. Dissertation, University of
Wisconsin, 1954.

Butler, Frank A., Standard Items to Observe
for the Improvement of Classroom Management,
unpublished Ph.D. Dissertation,
University of Wisconsin, 1928.

Butler, F. A., “Standard Items to Observe
for the Improvement of Teaching in Classroom
Management,” Journal of Educational
Method, IX (1930), pp. 517-527.

Carlson, Gustave E., Characteristic Differences
Between Good and Poor Teachers,
unpublished Ph.D. Dissertation, University
of Wisconsin, 1942.

Cash, Christine B., A Study of School-Community
Relations in Selected High Schools
of East Texas, unpublished Ph.D. Dissertation,
University of Wisconsin, 1947.

Doane, Kenneth R., A Study of the Professional
Curriculum Requirements for the
Preparation of High School Teachers in
the United States, unpublished Ph.D. Dissertation,
University of Wisconsin, 1947.

Douglas, Lowell N., A Study of Certain Factors
with Special Reference to Health, unpublished
Ph.D. Dissertation, University
of Wisconsin, 1939.

Douglas, Lois, “The Pre-training Selection
of Teachers,” Journal of Educational Research,
XXVIII (Oct. 1934), pp. 92-117.

Draves, David D., A Study of Class Size and
Instructional Methods, unpublished Ph.D.
Dissertation, University of Wisconsin,
1957.

Eichsteadt, A.C., Factors Associated with
the Development and Non-development of
Primary Patterns of the Strong Vocational
Interest for Men, unpublished Ph.D. Dissertation,
University of Wisconsin, 1949.

Emans, Lester M., In-service Education of
Teachers, unpublished Ph.D. Dissertation,
University of Wisconsin, 1947.

Erickson, Harley E., “A Factorial Study of
Teaching Ability,” Journal of Experimental
Education, XXIII (Sept. 1954), pp. 1-39.

Eustice, David E., Experience Background
on Teaching Success, unpublished Ph.D.
Dissertation, University of Wisconsin,
1961.

Flanagan, Carroll Edward, “The Predictive
Value of the Minnesota Multiphasic Inventory,”
Journal of Experimental Education,
XXIX (June 1961).

Flemming, Thomas F., A Study of Trends in
Interest and Support of Teacher Welfare,
unpublished Ph.D. Dissertation, University
of Wisconsin, 1951.

Goldgruber, John J., A Study of the Pre-service
Education of Teachers Graduated
from Three Wisconsin Teacher Training
Institutions, unpublished Ph.D. Dissertation,
University of Wisconsin, 1957.

Golden, Melvin, Behaviors Related to Effective
Teaching, unpublished Ph.D. Dissertation,
University of Wisconsin, 1957.

Gotham, R.E., “Personality and Teaching
Efficiency,” Journal of Experimental Education,
XIV (Dec. 1945), pp. 157-165.

Grotke, Earl M., “A Study of Professional
Distances Between the Raters of Teachers
and the Teachers Rated,” Journal of Experimental
Education, XXIV (Sept. 1955),
pp. 1-41.

Guiles, Roger E., A Study of Practices, Conditions,
and Trends in Relation to the Function
of Wisconsin State Teachers Colleges,
unpublished Ph.D. Dissertation, University
of Wisconsin, 1947.

Hampton, Nellie D., “An Analysis of Supervisory
Ratings of Elementary Teachers
Graduated from Iowa State Teachers
College,” Journal of Experimental Education,
XX (Dec. 1951), pp. 179-216.

Hellfritzsch, A.G., “A Factor Analysis of
Teachers’ Abilities,” Journal of Experimental
Education, XIV (Dec. 1945), pp.
166-199.

Hult, Esther, “Study of Achievement in Educational
Psychology,” Journal of Experimental
Education, XIII (June 1945), pp.
174-190.

Jackson, Virgil D., The Measurement of Social
Proficiency, unpublished Ph.D. Dissertation,
University of Wisconsin, 1939.

Jayne, C.D., “A Study of the Relationship Between
Teaching Procedures and Educational
Outcomes,” Journal of Experimental
Education, XIV (Dec. 1945), pp. 101-134.

Johnson, Alfred H., “The Responses of High
School Seniors to a Set of Structured Situations
Concerning Teaching as a Career,”
Journal of Experimental Education, XXVI
(June 1958), pp. 263-314.

Johnson, Carl, The Measurement of Teaching
Ability, unpublished Ph.D. Dissertation,
University of Wisconsin, 1932.

Jones, Agnes, A Follow-up Study of Beginning
Home Economics Teachers Graduated
from the University of Wisconsin to
Ascertain Educational Needs, unpublished
Ph.D. Dissertation, University of Wisconsin,
1954.

Jones, Margaret Lois, “Analysis of Certain
Aspects of Teaching Ability,” Journal of
Experimental Education, XXV (Dec. 1956),
pp. 103-180.

Jones, Ronald D., “The Prediction of Teaching
Efficiency from Objective Measures,”
Journal of Experimental Education, XV
(1946), pp. 85-99.

Kline, Francis, Satisfactions and Annoyances
in Teaching, unpublished Ph.D. Dissertation,
University of Wisconsin, 1949.

Knox, William D., “A Study of the Relationships
of Certain Environmental Factors to
Teaching Success,” Journal of Experimental
Education, XXV (Dec. 1956), pp.
95-151.

La Duke, C. V., “The Measurement of Teaching
Ability,” Journal of Experimental Education,
XIV (Sept. 1945), pp. 75-100.

Lamke, Tom A., “Personality and Teaching
Success,” Journal of Experimental Education,
XX (Dec. 1951), pp. 217-260.

Lange, Phil C., “A Study of Concepts Developed
by Students in an Undergraduate
Course in the Psychology and Practice of
Teaching,” Journal of Educational Research,
XXXVI (May 1943), pp. 641-661.

Lien, Arnold J., “A Comparative-predictive
Study of Students in the Four Curricula of
a Teacher Education Institution,” Journal
of Experimental Education, XXI (Dec.
1952), pp. 81-219.

Lins, Leo Joseph, “The Prediction of Teaching
Efficiency,” Journal of Experimental
Education, XV (Sept. 1946), pp. 2-60.

Lyon, Virgil E., The Validity of Certain Instruments
Employed in the Measurement
of Teaching Ability, unpublished Ph.D.
Dissertation, University of Wisconsin,
1932.

Mann, Mary (Sister Jacinta), Activities and
Success of University of Wisconsin Graduates,
unpublished Ph.D. Dissertation,
University of Wisconsin, 1959.

Manwiller, Lloyd V., “Expectations Regarding
Teachers,” Journal of Experimental
Education, XXVI (June 1958), pp. 315-354.

Martindale, Frank E., “Situational Factors
in Teacher Placement and Success,” Journal
of Experimental Education, XX (Dec.
1951), pp. 121-178.

Mathews, Lee H., “An Item Analysis of
Measures of Teaching Ability,” Journal
of Educational Research, XXXII (1940),
pp. 576-580.

McCoard, W.B., “Speech Factors as Related
to Teaching Efficiency,” Speech Monographs,
XI (1944), pp. 53-64.

55a. Midthun, M.A., The Objectivity of an Activity
Check List for the Study and Improvement
of Teaching, Master’s Dissertation,
University of Wisconsin, 1928.













Mitchell, Roy A., The Curriculum for Secondary
School Teachers, unpublished Ph.D.
Dissertation, University of Wisconsin.

Montross, Harold W., “Temperament and
Teaching Success,” Journal of Experimental
Education, XXIII (Sept. 1954), pp. 41-71.

Nemec, L. G., “Relationships Between
Teacher Certification and Education in
Wisconsin: A Study of Their Effects on Beginning
Teachers,” Journal of Experimental
Education, XV (1946), pp. 101-132.

Pitts, Gaylord E., An Experimental Study of
the Effectiveness of Different Methods of
Organizing and Directing Student Experiences,
unpublished Ph.D. Dissertation,
University of Wisconsin, 1942.

Plumb, Valworth R., The Prediction of Academic
Success at the University of Wisconsin,
Duluth Branch, unpublished Doctor’s
Dissertation, University of Wisconsin,
1950.

Reitz, William and Others, Admission Status
of 2188 Applicants for Teacher Education,
Research Studies in Admission and Placement,
No. 2 (Detroit: College of Education,
Wayne University, 1950).

Rieck, Elmer C., “The Comparison of
Teachers’ Response Patterns on the Minnesota
Multiphasic Personality Inventory,”
Journal of Experimental Education, XXIX
(June 1961), pp. 355-372.

Riesch, Kenneth P., A Study of Some Factors
in Pupil Growth, unpublished Ph.D. Dissertation,
University of Wisconsin, 1948.

Ringness, Thomas A., “Relationships Between
Certain Attitudes Toward Teaching
Success,” Journal of Experimental Education,
XXI (Sept. 1952), pp. 1-55.

Rolfe, J. F., “The Measurement of Teaching
Ability, Study No. 1,” Journal of Experimental
Education, XIV (Sept. 1945), pp.
52-74.

Rostker, L. E., “The Measurement of Teaching
Ability, Study No. 1,” Journal of Experimental
Education, XIV (Sept. 1945),
pp. 6-51.

Rudisill, Mable, Personality and Teacher
Success, unpublished Ph.D. Dissertation,
University of Wisconsin, 1931.

Russell, Leila, Characteristics of Undergraduates
at the University of Wisconsin
Training to be Teachers, unpublished Ph.D.
Dissertation, University of Wisconsin.

Schick, George J., The Predictive Value of
a Teacher Judgment Test, unpublished Ph.D.
Dissertation, University of Wisconsin,
1957.

Schmid, John Jr., “Factor Analysis of Prospective
Teachers’ Differences,” Journal
of Experimental Education, XVIII (June
1950), pp. 287-321.


70a. Schoonover, H. F., A Study of the Objectivity
of a Teacher Check List, Bachelor’s Dissertation,
University of Wisconsin, 1927.

71. Schwahn, Wilson E., A Study of Certain Aspects
of Teacher Education in Wisconsin,
unpublished Ph.D. Dissertation, University
of Wisconsin, 1956.











Schwartz, Anthony N., “A Study of the Discriminatory
Efficiency of Certain Tests of
the Primary Source Personality Traits of
Teachers,” Journal of Experimental Education,
XIX (Sept. 1950), pp. 63-93.

Sigurdson, Sigurd, The Reliability of an
Activity Check List for the Study and Improvement
of Teaching, Bachelor’s Dissertation,
University of Wisconsin, 1927.

Singer, Arthur, “Social Competence and
Success in Teaching,” Journal of Experimental
Education, XXIII (Dec. 1954), pp.
91-131.

Stevens, Leila, The Attitude of High School
Seniors Toward Teaching as a Career, unpublished
Ph.D. Dissertation, University
of Wisconsin, 1954.

Stoelting, Gustave J., “The Selection of
Candidates for Teacher Education at the
University of Wisconsin,” Journal of Experimental
Education, XXIV (Dec. 1955),
pp. 115-140.

Struck, L.A., The Reliability of an Activity
Check List for the Study and Improvement
of Teaching, Master’s Dissertation, University
of Wisconsin, 1928.

Thiede, Wilson B., “Some Characteristics
of Juniors Enrolled in Selected Curricula
at the University of Wisconsin,” Journal
of Experimental Education, XIX (Sept.
1950), pp. 1-62.

Tiedeman, Stuart C., “A Study of Teacher
Pupil Relationships,” Journal of Educational
Research, XXXV (May 1942), pp.
657-664.

Torgerson, T.L., The Measurement of
Teaching Ability, unpublished Ph.D. Dissertation,
University of Wisconsin, 1930.




















Torrence, Andrew P., “A Study of the Relationships
of Certain Competencies to Success
in Teaching Vocational Agriculture,”
Journal of Experimental Education, XXV
(Sept. 1956), pp. 1-31.

Vertain, J. Dale, The Personal, Social,
and Intellectual Characteristics of Undergraduates
Training to be Teachers at a
Wisconsin State College, unpublished Ph.D.
Dissertation, University of Wisconsin,
1960.

















Von Eschen, C. R., “The Improvability of
Teachers in Service,” Journal of Experimental
Education, XIV (1945), pp. 135-156.

Von Haden, Herbert I., “An Evaluation of
Certain Types of Personal Data Employed
in the Prediction of Teaching Efficiency,”
Journal of Experimental Education, XV
(Sept. 1946), pp. 61-84.

83. Walvoord, Anthony C., The Validity of Certain
Instruments Employed in the Measurement
of Teaching Ability, unpublished
Ph.D. Dissertation, University of Wisconsin,
1932.












